From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <32a012bc0c8ebc771a2c2c2dab58ef20@snellwilcox.com>
From: plan9fans@ntlworld.nospam.com
To: 9fans@cse.psu.edu
Subject: Re: [9fans] tactic
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Date: Fri, 2 Apr 2004 10:25:14 +0100
Topicbox-Message-UUID: 4eaa618e-eacd-11e9-9e20-41e7f4b1d025

Hi,

For spam filtering I use the Plan9 automatic one-time certificate
system (please send me back an email with this magic string in the
subject line, etc.). It does increase traffic, but it works very well.

Random thoughts on Bayesian filtering:

I don't understand why none of the filters seem to have an LRU policy
to keep the database size down; perhaps they do now, I am a bit out of
touch.

I am impressed with CRM114's sort-of Markov chain approach, which looks
much more likely to succeed. I keep meaning to try it...

I have started to get emails consisting of text rendered into an image,
with no textual content at all. This is not as much of a problem to
filter as it might appear: there are image fingerprinting techniques
which are very robust to size, resolution, and colour changes. These
fingerprints could then be used as another degree of freedom in a
Bayesian framework.

-Steve
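
A rough sketch of the LRU idea mentioned above, in Python. This is not
how any particular filter does it; the class name LRUTokenStore, the
max_tokens cap, and the simple spam/ham ratio are all assumptions made
for illustration. The point is only that a token database can be
bounded by evicting whichever token was seen least recently.

```python
from collections import OrderedDict


class LRUTokenStore:
    """Hypothetical token database capped at max_tokens entries.

    Tokens are kept in order of last sighting during training; when the
    cap is exceeded, the least recently seen token is evicted.
    """

    def __init__(self, max_tokens):
        self.max_tokens = max_tokens
        self.counts = OrderedDict()  # token -> [spam_count, ham_count]

    def train(self, tokens, is_spam):
        for tok in tokens:
            # Remove the entry (or start a fresh one) and update its count.
            entry = self.counts.pop(tok, [0, 0])
            entry[0 if is_spam else 1] += 1
            # Re-inserting places the token at the most-recent end.
            self.counts[tok] = entry
            if len(self.counts) > self.max_tokens:
                # Drop the token at the least-recent end.
                self.counts.popitem(last=False)

    def spam_ratio(self, tok):
        """Fraction of sightings that were spam; 0.5 for unknown tokens."""
        spam, ham = self.counts.get(tok, (0, 0))
        total = spam + ham
        return spam / total if total else 0.5


store = LRUTokenStore(max_tokens=3)
store.train(["viagra", "cheap", "pills"], is_spam=True)
store.train(["plan9", "cheap"], is_spam=False)
```

With a cap of three, training on the second message pushes out the
oldest token ("viagra"), while "cheap" survives because it was seen
again. In this sketch only training refreshes a token's recency; a
real filter might also want lookups at classification time to count.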