From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jan Tatarik
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Scoring on basee64 encoded message body
Date: Fri, 23 Mar 2012 13:11:24 +0100
Message-ID: <5n5x2r8virwm8j.fsf@nb-jtatarik2.xing.hh>
References: <5n5x2rvcm5zibm.fsf@nb-jtatarik2.xing.hh>
In-Reply-To: (Lars Magne Ingebrigtsen's message of "Thu, 22 Mar 2012 21:38:51 +0100")
To: ding@gnus.org
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.94 (gnu/linux)
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Xref: news.gmane.org gmane.emacs.gnus.general:81646

--=-=-=
Content-Type: text/plain

On Thu, Mar 22 2012, Lars Magne Ingebrigtsen wrote:

> Jan Tatarik writes:
>
>> This better?
>
> Yes, that looks better, but it should probably just call
> `mm-decode-content-transfer-encoding' instead, I think?

Like this?

--=-=-=
Content-Type: text/x-diff
Content-Disposition: inline;
 filename=decode-message-before-scoring-on-body.diff
Content-Description: decode message body before scoring on it

diff --git a/lisp/gnus-logic.el b/lisp/gnus-logic.el
index 954295438c953c2500b9c1959a49e52312cc9653..9216f5699ce1ed9a8c39dd03257a17885f6e8490 100644
--- a/lisp/gnus-logic.el
+++ b/lisp/gnus-logic.el
@@ -181,8 +181,10 @@
     (with-current-buffer nntp-server-buffer
       (let* ((request-func (cond ((string= "head" header)
                                   'gnus-request-head)
+                                 ;; We need to peek at the headers to detect the
+                                 ;; content encoding
                                  ((string= "body" header)
-                                  'gnus-request-body)
+                                  'gnus-request-article)
                                  (t 'gnus-request-article)))
              ofunc article)
         ;; Not all backends support partial fetching.  In that case, we
@@ -196,6 +198,20 @@
         (gnus-message 7 "Scoring article %s..." article)
         (when (funcall request-func article gnus-newsgroup-name)
           (goto-char (point-min))
+          ;; Searching base64/qp-encoded message body produces more
+          ;; satisfactory results if we decode the message first
+          (unless (or (eq ofunc 'gnus-request-head)
+                      (eq request-func 'gnus-request-head))
+            (let ((encoding (gnus-fetch-field "content-transfer-encoding")))
+              (when encoding
+                (save-excursion
+                  (save-restriction
+                    ;; narrow to body
+                    (narrow-to-region
+                     (or (search-forward "\n\n" nil t) (point))
+                     (point-max))
+                    (mm-decode-content-transfer-encoding
+                     (intern (downcase encoding))))))))
           ;; If just parts of the article is to be searched and the
           ;; backend didn't support partial fetching, we just narrow to
           ;; the relevant parts.
diff --git a/lisp/gnus-score.el b/lisp/gnus-score.el
index f86b6f837a70ce54b06668187821fe57c3f80f4c..322aed78fa374b873fb8604482adead77362c5be 100644
--- a/lisp/gnus-score.el
+++ b/lisp/gnus-score.el
@@ -1752,8 +1752,10 @@ score in `gnus-newsgroup-scored' by SCORE."
            (all-scores scores)
            (request-func (cond ((string= "head" header)
                                 'gnus-request-head)
+                               ;; We need to peek at the headers to detect
+                               ;; the content encoding
                                ((string= "body" header)
-                                'gnus-request-body)
+                                'gnus-request-article)
                                (t 'gnus-request-article)))
            entries alist ofunc article last)
       (when articles
@@ -1773,6 +1775,20 @@ score in `gnus-newsgroup-scored' by SCORE."
             (widen)
             (when (funcall request-func article gnus-newsgroup-name)
               (goto-char (point-min))
+              ;; Searching base64/qp-encoded message body produces more
+              ;; satisfactory results if we decode the message first
+              (unless (or (eq ofunc 'gnus-request-head)
+                          (eq request-func 'gnus-request-head))
+                (let ((encoding (gnus-fetch-field "content-transfer-encoding")))
+                  (when encoding
+                    (save-excursion
+                      (save-restriction
+                        ;; narrow to body
+                        (narrow-to-region
+                         (or (search-forward "\n\n" nil t) (point))
+                         (point-max))
+                        (mm-decode-content-transfer-encoding
+                         (intern (downcase encoding))))))))
               ;; If just parts of the article is to be searched, but the
               ;; backend didn't support partial fetching, we just narrow
               ;; to the relevant parts.
--=-=-=--