From: Ted Zlatanov
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Splitting based on character sets
Date: Thu, 14 Apr 2011 09:54:51 -0500
Organization: Теодор Златанов @ Cienfuegos
Message-ID: <87d3koanno.fsf@lifelogs.com>
References: <87y63qb0b9.fsf@lifelogs.com> <87vcyjo1pl.fsf@topper.koldfront.dk> <87lizfxu7r.fsf@lifelogs.com>
To: ding@gnus.org

On Thu, 14 Apr 2011 10:28:24 +0200 David Engster wrote:

DE> I've long tried to find a single, pure black-box-machine-learning
DE> spam detection which could rival Spamassassin, main reason being that SA
DE> can be such a memory hog on smaller servers. But I always came back.

Yeah, the memory usage killed SA for me.  With 1-10 users it's OK, but
for a large mail server (where I worked at the time) it was not usable
in my testing.  When I stopped using SA there I also stopped using it
for myself.

DE> But I guess what Lars originally asked was not to classify mails in
DE> Russian, but to just classify them as spam, since he doesn't get
DE> Russian ham.

Yeah, I was thinking sort of sideways to his question.  Sorry.

DE> For this, one can use the Textcat plugin from SA, which will try to
DE> guess the language of the mail and include a X-Language header.

Yeah, that should work.  That's really useful, thanks for mentioning it.

Ted