From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from scc-mailout-kit-01.scc.kit.edu (scc-mailout-kit-01.scc.kit.edu [129.13.231.81])
	by fantadrom.bsd.lv (OpenSMTPD) with ESMTP id 4dd076b2
	for ; Sat, 9 Jul 2016 11:12:55 -0500 (EST)
Received: from asta-nat.asta.uni-karlsruhe.de ([172.22.63.82] helo=hekate.usta.de)
	by scc-mailout-kit-01.scc.kit.edu with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
	(envelope-from ) id 1bLus8-00079Z-8f; Sat, 09 Jul 2016 18:12:54 +0200
Received: from donnerwolke.usta.de ([172.24.96.3])
	by hekate.usta.de with esmtp (Exim 4.77)
	(envelope-from ) id 1bLus7-0005dO-Uw; Sat, 09 Jul 2016 18:12:51 +0200
Received: from athene.usta.de ([172.24.96.10])
	by donnerwolke.usta.de with esmtp (Exim 4.84_2)
	(envelope-from ) id 1bLus7-00051u-Pj; Sat, 09 Jul 2016 18:12:51 +0200
Received: from localhost (athene.usta.de [local])
	by athene.usta.de (OpenSMTPD) with ESMTPA id e271dce0;
	Sat, 9 Jul 2016 18:12:51 +0200 (CEST)
Date: Sat, 9 Jul 2016 18:12:51 +0200
From: Ingo Schwarze
To: Baptiste Daroussin
Cc: tech@mdocml.bsd.lv
Subject: Re: New db format
Message-ID: <20160709161251.GD6629@athene.usta.de>
References: <20160709131158.GA6629@athene.usta.de>
	<20160709133936.tr4zsiekmrbfuav2@ivaldir.etoilebsd.net>
	<97865.1468074718@CATHET.us>
	<20160709152926.42bre6tittzmpm2g@ivaldir.etoilebsd.net>
X-Mailinglist: mdocml-tech
Reply-To: tech@mdocml.bsd.lv
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160709152926.42bre6tittzmpm2g@ivaldir.etoilebsd.net>
User-Agent: Mutt/1.6.2 (2016-07-01)

Hi Baptiste,

Baptiste Daroussin FreeBSD wrote on Sat, Jul 09, 2016 at 05:29:27PM +0200:

> Besides minor portability issues (that I will provide fixes for once
> this patch is official)

You can already provide them now if you want to, and i'll include
them in my work-in-progress version and testing.

> I can say it works fine on FreeBSD :)
> Tested on FreeBSD 12-CURRENT:

Great to hear that.
My main worry is that i screwed up the alignment somewhere.
Did you test on any machines with unusual alignment requirements?
Or is keeping int32_t aligned to 4-byte boundaries guaranteed
to be safe everywhere?

> The 3 main pieces of "bad" feedback I got about mandoc were:
> - time to create the db was long with sqlite; with the new code
>   it is longer: it goes from 4s to 11s in general for the base manpages
>   but I do not consider that a big issue given it is not run that
>   often and the code is quite new and there is probably room for
>   improvement

The SQLite version was optimized with gprof(1).  I can try again
for the new code.  But mandoc's makewhatis(8) will always remain
much slower than traditional makewhatis(8) because it parses the
complete manuals rather than just the NAME sections.

> - size of the db (for embedded). With the new code:
>   3.8M /usr/share/man/mandoc.db
>   1.6M /usr/share/man/mandoc.new.db
>
>   This is a big improvement
>
> - apropos can be very slow before showing anything, with the new code,
>   it is instant!
>   old 'apropos ls':
>   8.67 real 3.63 user 5.03 sys

That is surprising.  Yes, mandoc's SQLite based apropos(1) is slower
than the traditional BSD apropos(1) - no big surprise given the
much larger database.  But i have never seen it be *that* slow
on OpenBSD, not even for complex queries accessing multiple macro
keys and using -a and -o; certainly not for a plain Nm,Nd access
like 'apropos ls' - that was always reasonably fast.

But now isn't the time for making apropos-1.13 faster any longer... :-)

  schwarze@fantadrom $ dmesg
  OpenBSD 5.8-stable (GENERIC) #1: Fri Oct 30 18:55:29 EST 2015
  cpu0: Intel(R) Pentium(R) 4 CPU 2.40GHz ("GenuineIntel" 686-class) 2.41 GHz
  real mem = 3203874816 (3055MB)
  [...]
  schwarze@fantadrom $ time apropos ls
  [...]
  0m00.19s real  0m00.10s user  0m00.08s system

  schwarze@man $ dmesg
  OpenBSD 5.9 (GENERIC.MP) #1888: Fri Feb 26 01:20:19 MST 2016
  cpu0: Intel(R) Xeon(R) CPU E31220 @ 3.10GHz, 3093.47 MHz
  cpu1: Intel(R) Xeon(R) CPU E31220 @ 3.10GHz, 3092.98 MHz
  cpu2: Intel(R) Xeon(R) CPU E31220 @ 3.10GHz, 3092.98 MHz
  cpu3: Intel(R) Xeon(R) CPU E31220 @ 3.10GHz, 3092.98 MHz
  real mem = 8541536256 (8145MB)
  [...]
  schwarze@man $ time apropos ls
  [...]
  0m00.02s real  0m00.01s user  0m00.01s system

  schwarze@isnote $ dmesg
  OpenBSD 6.0-beta (GENERIC.MP) #1875: Sun Jun 19 11:51:07 MDT 2016
  cpu0: Genuine Intel(R) CPU T2300 @ 1.66GHz ("GenuineIntel" 686-class) 1.67 GHz
  cpu1: Genuine Intel(R) CPU T2300 @ 1.66GHz ("GenuineIntel" 686-class) 1.67 GHz
  real mem = 3211083776 (3062MB)
  [...]
  schwarze@isnote $ time apropos ls
  [...]
  0m00.22s real  0m00.09s user  0m00.07s system

>   new 'apropos ls':
>   0.02 real 0.02 user 0.00 sys
>
> I notice some memory corruption in the output of apropos,
> I haven't dug into it yet

That might already be fixed; i just uploaded the current version
of the patch here:

  http://mdocml.bsd.lv/snapshots/mdocml-1.14.0.06.patch
  http://mdocml.bsd.lv/snapshots/mdocml-1.14.0.06.regress.tgz
  http://mdocml.bsd.lv/snapshots/mdocml-1.14.0.06.log.txt

THIS IS NOT A RELEASE AND NOT INTENDED FOR PRODUCTION.
This code is likely to be still buggy.

> Great work, I'm eager to see that code in!

Thanks for your support!

Yours,
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv
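
Editorial sketch on the int32_t alignment question asked earlier in this
message.  In C, sizeof of a type is always a multiple of its alignment,
so int32_t can never need stricter alignment than its own size, which is
4 bytes on machines with 8-bit bytes.  Fields kept at 4-byte offsets from
a suitably aligned base should therefore be readable directly.  Where an
offset cannot be trusted, assembling the value byte by byte avoids the
question altogether.  The helper names below are hypothetical and are not
taken from the patch, and the big-endian byte order is an assumption made
only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compile-time check (C11): fails the build on any platform where
 * int32_t would need more than 4-byte alignment. */
_Static_assert(_Alignof(int32_t) <= 4,
    "int32_t needs more than 4-byte alignment");

/* Hypothetical helper: is a byte offset from an aligned base
 * usable for a direct int32_t access? */
static int
offset_is_int32_aligned(size_t off)
{
	return off % _Alignof(int32_t) == 0;
}

/* Hypothetical helper: read a 32-bit value stored in big-endian
 * order (an assumption for this sketch) from any address, aligned
 * or not, by assembling it one byte at a time. */
static uint32_t
read_be32(const unsigned char *p)
{
	assert(p != NULL);
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

A reader of the database could call offset_is_int32_aligned() on a stored
offset before casting to an int32_t pointer, and fall back to read_be32()
otherwise.  The byte-wise path costs a few shifts but never faults on
strict-alignment hardware.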