From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.general/81748
From: Jan Tatarik
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Scoring on basee64 encoded message body
Date: Fri, 13 Apr 2012 00:58:59 +0200
Message-ID: <5n5x2riph4d0b0.fsf@nb-jtatarik2.xing.hh>
References: <5n5x2rvcm5zibm.fsf@nb-jtatarik2.xing.hh>
	<5n5x2r8virwm8j.fsf@nb-jtatarik2.xing.hh>
	<5n5x2rehrudpuv.fsf@nb-jtatarik2.xing.hh>
In-Reply-To: (Lars Magne Ingebrigtsen's message of "Thu, 12 Apr 2012 20:45:49 +0200")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.1.50 (gnu/linux)
To: ding@gnus.org
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Xref: news.gmane.org gmane.emacs.gnus.general:81748

--=-=-=
Content-Type: text/plain

On Thu, Apr 12 2012, Lars Magne Ingebrigtsen wrote:

>> - run mm-dissect-buffer on the message body (any idea which args would
>>   be appropriate?)

> NO-STRICT-MIME, I think.

>> - for multipart messages, pick the handles with text/* type, run them
>>   through their respective mm-inline-* function as defined in
>>   mm-inline-media-tests
>> - score on all the decoded text parts

>> Is this the way to go?

> Yup; sounds good.

And here is the new patch.

--=-=-=
Content-Type: text/x-diff
Content-Disposition: attachment; filename=gnus-score-decode-text-parts.diff
Content-Description: decode mm messages when scoring on body

diff --git a/lisp/gnus-logic.el b/lisp/gnus-logic.el
index 954295438c953c2500b9c1959a49e52312cc9653..38442a406dd6fb6ef47cd468248618e68337a26a 100644
--- a/lisp/gnus-logic.el
+++ b/lisp/gnus-logic.el
@@ -181,8 +181,10 @@
           (with-current-buffer nntp-server-buffer
             (let* ((request-func (cond ((string= "head" header)
                                         'gnus-request-head)
+                                       ;; We need to peek at the headers to detect the
+                                       ;; content encoding
                                        ((string= "body" header)
-                                        'gnus-request-body)
+                                        'gnus-request-article)
                                        (t 'gnus-request-article)))
                    ofunc article)
               ;; Not all backends support partial fetching.  In that case, we
@@ -196,6 +198,7 @@
               (gnus-message 7 "Scoring article %s..." article)
               (when (funcall request-func article gnus-newsgroup-name)
                 (goto-char (point-min))
+                (gnus-score-decode-text-parts)
                 ;; If just parts of the article is to be searched and the
                 ;; backend didn't support partial fetching, we just narrow to
                 ;; the relevant parts.
diff --git a/lisp/gnus-score.el b/lisp/gnus-score.el
index f86b6f837a70ce54b06668187821fe57c3f80f4c..003355dd2c91847241dc67263f83d26ae52920de 100644
--- a/lisp/gnus-score.el
+++ b/lisp/gnus-score.el
@@ -1736,6 +1736,24 @@ score in `gnus-newsgroup-scored' by SCORE."
         (setq entries rest)))))
   nil)
 
+(defun gnus-score-decode-text-parts ()
+  (let ((handles (mm-dissect-buffer t)))
+    (cond ((stringp (car handles)) (pop handles))
+          ((and (bufferp (car handles))
+                (stringp (car (mm-handle-type handles))))
+           (setq handles (list handles))))
+
+    (save-excursion
+      (article-goto-body)
+      (delete-region (point) (point-max))
+      (save-restriction
+        (narrow-to-region (point) (point))
+        (mapc #'mm-display-inline
+              (remove-if-not
+               (lambda (handle)
+                 (string-match "^text/" (mm-handle-media-type handle)))
+               handles))))))
+
 (defun gnus-score-body (scores header now expire &optional trace)
   (if gnus-agent-fetching
       nil
@@ -1752,8 +1770,10 @@ score in `gnus-newsgroup-scored' by SCORE."
              (all-scores scores)
              (request-func (cond ((string= "head" header)
                                   'gnus-request-head)
+                                 ;; We need to peek at the headers to detect
+                                 ;; the content encoding
                                  ((string= "body" header)
-                                  'gnus-request-body)
+                                  'gnus-request-article)
                                  (t 'gnus-request-article)))
              entries alist ofunc article last)
         (when articles
@@ -1773,6 +1793,7 @@ score in `gnus-newsgroup-scored' by SCORE."
             (widen)
             (when (funcall request-func article gnus-newsgroup-name)
               (goto-char (point-min))
+              (gnus-score-decode-text-parts)
               ;; If just parts of the article is to be searched, but the
               ;; backend didn't support partial fetching, we just narrow
               ;; to the relevant parts.

--=-=-=--
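
The decoding step the patch relies on can also be tried outside Gnus. What follows is only a rough sketch and not part of the patch: the my-demo-* names are hypothetical helpers, and it assumes that mm-get-part is good enough for looking at the transfer-decoded text (it does no charset decoding and, unlike mm-display-inline, renders nothing). It dissects a base64-encoded article with NO-STRICT-MIME set, as suggested in the thread, and collects the text/* parts, descending into nested multiparts.

```elisp
;; Sketch only.  The my-demo-* names are hypothetical, not Gnus API.
(require 'cl-lib)
(require 'mm-decode)

(defun my-demo-text-handles (handle)
  "Return a flat list of the text/* leaf handles found under HANDLE.
HANDLE is whatever `mm-dissect-buffer' returned: a single-part
handle whose car is a buffer, or a multipart list whose car is
the multipart type string."
  (cond ((null handle) nil)
        ;; Multipart: ("multipart/..." PART PART ...); recurse into parts.
        ((stringp (car handle))
         (cl-mapcan #'my-demo-text-handles (cdr handle)))
        ;; Leaf part: keep it only when its media type is text/*.
        ((bufferp (car handle))
         (when (string-match "\\`text/" (mm-handle-media-type handle))
           (list handle)))))

(defun my-demo-decoded-text-parts ()
  "Return the transfer-decoded text/* parts of the current buffer.
The buffer must hold a complete article, headers included."
  ;; A non-nil first argument is NO-STRICT-MIME: dissect even when
  ;; the article has no MIME-Version header.
  (let ((handles (mm-dissect-buffer t)))
    (unwind-protect
        (mapcar #'mm-get-part (my-demo-text-handles handles))
      ;; Kill the per-part buffers created by the dissection.
      (mm-destroy-parts handles))))

(defun my-demo-run ()
  "Dissect a tiny base64 article and return its decoded text parts."
  (with-temp-buffer
    (insert "Content-Type: text/plain\n"
            "Content-Transfer-Encoding: base64\n"
            "\n"
            "aGVsbG8gd29ybGQ=\n")
    (my-demo-decoded-text-parts)))
```

Evaluating (my-demo-run) should return a one-element list holding the decoded body, which a body score regexp can then match; the same regexp applied to the raw article buffer would only see the base64 text. The sketch returns strings instead of rewriting the article buffer in place, which keeps it free of side effects but is a different design from the patch.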