From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.general/9228 Path: main.gmane.org!not-for-mail From: David Moore Newsgroups: gmane.emacs.gnus.general Subject: Re: Red Gnus v0.76 is released Date: 16 Dec 1996 22:26:18 -0800 Sender: dmoore@sdnp5.ucsd.edu Message-ID: References: NNTP-Posting-Host: coloc-standby.netfonds.no X-Trace: main.gmane.org 1035149284 17188 80.91.224.250 (20 Oct 2002 21:28:04 GMT) X-Complaints-To: usenet@main.gmane.org NNTP-Posting-Date: Sun, 20 Oct 2002 21:28:04 +0000 (UTC) Return-Path: Original-Received: from ifi.uio.no (0@ifi.uio.no [129.240.64.2]) by deanna.miranova.com (8.8.4/8.8.4) with SMTP id WAA24408 for ; Mon, 16 Dec 1996 22:40:22 -0800 Original-Received: from UCSD.EDU (mailbox2.ucsd.edu [132.239.1.54]) by ifi.uio.no with ESMTP (8.6.11/ifi2.4) id for ; Tue, 17 Dec 1996 07:28:04 +0100 Original-Received: from sdnp5.ucsd.edu (sdnp5.ucsd.edu [132.239.79.10]) by UCSD.EDU (8.8.3/8.6.9) with SMTP id WAA26686 for ; Mon, 16 Dec 1996 22:28:03 -0800 (PST) Original-Received: by sdnp5.ucsd.edu (SMI-8.6/SMI-SVR4) id WAA28499; Mon, 16 Dec 1996 22:26:19 -0800 Original-To: "(ding) Gnus Mailing List" X-Face: "oX;zS#-JU$-,WKSzG.1gGE]x^cIg!hW.dq>.f6pzS^A+(k!T|M:}5{_%>Io<>L&{hO7W4cicOQ|>/lZ1G(m%7iaCf,6Qgk0%%Bz7b2-W3jd0m_UG\Y;?]}4s0O-U)uox>P3JN)9cm]O\@,vy2e{`3pb!"pqmRy3peB90*2L In-Reply-To: Jim Meyering's message of 16 Dec 1996 22:36:33 -0600 Original-Lines: 45 X-Mailer: Red Gnus v0.75/XEmacs 19.14 Xref: main.gmane.org gmane.emacs.gnus.general:9228 X-Report-Spam: http://spam.gmane.org/gmane.emacs.gnus.general:9228 Jim Meyering writes: > Lars Magne Ingebrigtsen writes: > | Thu Dec 12 18:18:11 1996 David Moore > | > | * gnus-util.el (gnus-create-hash-size): Power of 2 hashtables can > | be _significantly_ faster than 2^x-1 tables on many risc > | machines. Any gains of 2^x-1 are comparably small on other > | machines. > > Hi. > > Have you considered any other implications of changing the table size > to be a power of 2? It's pretty generally accepted that hash tables > get better key distributions (fewer collisions) when the table size > is a prime -- or has relatively few factors. A bad table size (and > powers of 2 are notoriously bad) can make the normally-very-efficient > (O(1)) insert/delete/lookup operations abyssymally slow O(n) ones. I've looked into this yes, and it's not as large of a problem with bucket hashs as it is with quadratic rehashing, typically. With the ~12,864 newgroups on my site placed in both 16,384 and 16,383 size hashtable using the xemacs hash function the number, the collision rate was virtually the same (less than 5% fewer buckets in the power of 2, although I can't say much about the chain size from lisp). However, on the sparc and some other risc processors, over 75% of the cost to search/insert into the hashtable is in the modulo operation. > I suspect that choosing table size == 2^N (rather than the > usually-good and sometimes-even-prime 2^N - 1) would degrade the > efficiency due to the increased frequency of collisions. It tends to depend on your hash function and input data, although I do usually use a rule of thumb of going to a larger table when using power of two. But again it only helps when your hash function is still doing something reasonable. I didn't bother doubling the hashtable size here, because I was mostly after reasonable O(1) operation, rather than 1 hit accesses. I didn't think a 128k table in memory was worth it, since the 64k table was doing fine. -- David Moore | Computer Systems Lab __o UCSD Dept. Computer Science - 0114 | Work: (619) 534-8604 _ \<,_ La Jolla, CA 92093-0114 | Fax: (619) 534-1445 (_)/ (_) | In a cloud bones of steel.