From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.general/81985 Path: news.gmane.org!not-for-mail From: Jan Tatarik Newsgroups: gmane.emacs.gnus.general Subject: Re: Scoring on basee64 encoded message body Date: Thu, 28 Jun 2012 11:45:23 +0200 Message-ID: <5n5x2rhatvvkgc.fsf@nb-jtatarik2.xing.hh> References: <5n5x2rvcm5zibm.fsf@nb-jtatarik2.xing.hh> <5n5x2r8virwm8j.fsf@nb-jtatarik2.xing.hh> <5n5x2rehrudpuv.fsf@nb-jtatarik2.xing.hh> <5n5x2riph4d0b0.fsf@nb-jtatarik2.xing.hh> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Trace: dough.gmane.org 1340876837 23278 80.91.229.3 (28 Jun 2012 09:47:17 GMT) X-Complaints-To: usenet@dough.gmane.org NNTP-Posting-Date: Thu, 28 Jun 2012 09:47:17 +0000 (UTC) Cc: Lars Magne Ingebrigtsen To: ding@gnus.org Original-X-From: ding-owner+M30255@lists.math.uh.edu Thu Jun 28 11:47:16 2012 Return-path: Envelope-to: ding-account@gmane.org Original-Received: from util0.math.uh.edu ([129.7.128.18]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1SkBJf-0003dQ-JF for ding-account@gmane.org; Thu, 28 Jun 2012 11:47:12 +0200 Original-Received: from localhost ([127.0.0.1] helo=lists.math.uh.edu) by util0.math.uh.edu with smtp (Exim 4.63) (envelope-from ) id 1SkBIC-0002vT-RV; Thu, 28 Jun 2012 04:45:40 -0500 Original-Received: from mx1.math.uh.edu ([129.7.128.32]) by util0.math.uh.edu with esmtps (TLSv1:AES256-SHA:256) (Exim 4.63) (envelope-from ) id 1SkBIA-0002vH-NC for ding@lists.math.uh.edu; Thu, 28 Jun 2012 04:45:38 -0500 Original-Received: from quimby.gnus.org ([80.91.231.51]) by mx1.math.uh.edu with esmtps (TLSv1:AES256-SHA:256) (Exim 4.76) (envelope-from ) id 1SkBI8-0008TU-Hs for ding@lists.math.uh.edu; Thu, 28 Jun 2012 04:45:38 -0500 Original-Received: from mail-bk0-f44.google.com ([209.85.214.44]) by quimby.gnus.org with esmtp (Exim 4.72) (envelope-from ) id 1SkBI6-0008Tk-8B; Thu, 28 Jun 2012 11:45:34 +0200 Original-Received: by bkty8 with SMTP id y8so2372365bkt.17 for ; Thu, 28 Jun 2012 02:45:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version:content-type; bh=Z3mxMrHx6DOskusqhMqRXPdok79U1BYGiRb6owJn+V8=; b=agCkQHchyGMr/1Qu1HGSO/kgCD61+j+t6FAGeB9O9JtNk+Zbao70VXFfZBIhznZ0HT tVepGSBzmwM0ziDJIsUeBbKyMcOUULY9ZyP9A0LjRvXZLwqTUDfDzWxjscglZDT+73U1 oSDYB0PglVc9sNtxybQymWVKKOcuX0cRsvjJ4Xbwlm3Py0y0CI7tUTmQCJPGDLIuYidH wl708ZbHZr1OXTm/Bz4g7L0WhfZ5ZsImbMux2phLPMJ6SdcLxvl7vv8XRa2+00kwweFs bsRl/+Uh9wR5H5uG71acnOdwxAk2zzHotpV65eaDZOJ4hFFBeqQC28q7HjS2AZqFfEj2 8WAQ== Original-Received: by 10.204.157.144 with SMTP id b16mr577861bkx.12.1340876728867; Thu, 28 Jun 2012 02:45:28 -0700 (PDT) Original-Received: from nb-jtatarik2.xing.hh.xing.hh (office.xing.com. [82.112.107.65]) by mx.google.com with ESMTPS id n5sm5395037bkv.14.2012.06.28.02.45.24 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 28 Jun 2012 02:45:26 -0700 (PDT) In-Reply-To: (Lars Magne Ingebrigtsen's message of "Sun, 10 Jun 2012 23:08:24 +0200") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.1.50 (gnu/linux) X-Spam-Score: -3.0 (---) List-ID: Precedence: bulk Xref: news.gmane.org gmane.emacs.gnus.general:81985 Archived-At: --=-=-= Content-Type: text/plain On Sun, Jun 10 2012, Lars Magne Ingebrigtsen Lars Magne Ingebrigtsen wrote: > Jan Tatarik writes: >> And here is the new patch. > Sorry for the extreme delays here. > The patch looks good. > I don't remember whether I asked you whether you had FSF copyright > assignments on file or not? I do now. --=-=-= Content-Type: text/x-diff Content-Disposition: inline; filename=score-by-body.diff Content-Description: score by body diff --git a/lisp/gnus-logic.el b/lisp/gnus-logic.el index 954295438c953c2500b9c1959a49e52312cc9653..a440b7799303ddf46674c35c20f5bba51388b50c 100644 --- a/lisp/gnus-logic.el +++ b/lisp/gnus-logic.el @@ -180,46 +180,51 @@ (setq header "article")) (with-current-buffer nntp-server-buffer (let* ((request-func (cond ((string= "head" header) - 'gnus-request-head) - ((string= "body" header) - 'gnus-request-body) - (t 'gnus-request-article))) - ofunc article) + 'gnus-request-head) + ;; We need to peek at the headers to detect the + ;; content encoding + ((string= "body" header) + 'gnus-request-article) + (t 'gnus-request-article))) + ofunc article handles) ;; Not all backends support partial fetching. In that case, we ;; just fetch the entire article. (unless (gnus-check-backend-function - (intern (concat "request-" header)) - gnus-newsgroup-name) - (setq ofunc request-func) - (setq request-func 'gnus-request-article)) + (intern (concat "request-" header)) + gnus-newsgroup-name) + (setq ofunc request-func) + (setq request-func 'gnus-request-article)) (setq article (mail-header-number gnus-advanced-headers)) (gnus-message 7 "Scoring article %s..." article) (when (funcall request-func article gnus-newsgroup-name) - (goto-char (point-min)) - ;; If just parts of the article is to be searched and the - ;; backend didn't support partial fetching, we just narrow to - ;; the relevant parts. - (when ofunc - (if (eq ofunc 'gnus-request-head) - (narrow-to-region - (point) - (or (search-forward "\n\n" nil t) (point-max))) - (narrow-to-region - (or (search-forward "\n\n" nil t) (point)) - (point-max)))) - (let* ((case-fold-search (not (eq (downcase (symbol-name type)) - (symbol-name type)))) - (search-func - (cond ((memq type '(r R regexp Regexp)) - 're-search-forward) - ((memq type '(s S string String)) - 'search-forward) - (t - (error "Invalid match type: %s" type))))) - (goto-char (point-min)) - (prog1 - (funcall search-func match nil t) - (widen))))))) + (when (string= "body" header) + (setq handles (gnus-score-decode-text-parts))) + (goto-char (point-min)) + ;; If just parts of the article is to be searched and the + ;; backend didn't support partial fetching, we just narrow to + ;; the relevant parts. + (when ofunc + (if (eq ofunc 'gnus-request-head) + (narrow-to-region + (point) + (or (search-forward "\n\n" nil t) (point-max))) + (narrow-to-region + (or (search-forward "\n\n" nil t) (point)) + (point-max)))) + (let* ((case-fold-search (not (eq (downcase (symbol-name type)) + (symbol-name type)))) + (search-func + (cond ((memq type '(r R regexp Regexp)) + 're-search-forward) + ((memq type '(s S string String)) + 'search-forward) + (t + (error "Invalid match type: %s" type))))) + (goto-char (point-min)) + (prog1 + (funcall search-func match nil t) + (widen))) + (when handles (mm-destroy-parts handles)))))) (provide 'gnus-logic) diff --git a/lisp/gnus-score.el b/lisp/gnus-score.el index f86b6f837a70ce54b06668187821fe57c3f80f4c..d50b3cc0cf02812ef32f53beaf85cd27171680d4 100644 --- a/lisp/gnus-score.el +++ b/lisp/gnus-score.el @@ -1736,105 +1736,140 @@ score in `gnus-newsgroup-scored' by SCORE." (setq entries rest))))) nil) +(defun gnus-score-decode-text-parts () + (labels ((mm-text-parts (handle) + (cond ((stringp (car handle)) + (let ((parts (mapcan 'mm-text-parts (cdr handle)))) + (if (equal "multipart/alternative" (car handle)) + ;; pick the first supported alternative + (list (car parts)) + parts))) + + ((bufferp (car handle)) + (when (string-match "^text/" (mm-handle-media-type handle)) + (list handle))) + + (t (mapcan 'mm-text-parts handle)))) + (my-mm-display-part (handle) + (when handle + (save-restriction + (narrow-to-region (point) (point)) + (mm-display-inline handle) + (goto-char (point-max)))))) + + (let (;(mm-text-html-renderer 'w3m-standalone) + (handles (mm-dissect-buffer t))) + (save-excursion + (article-goto-body) + (delete-region (point) (point-max)) + (mapc 'my-mm-display-part (mm-text-parts handles)) + handles)))) + (defun gnus-score-body (scores header now expire &optional trace) - (if gnus-agent-fetching - nil - (save-excursion - (setq gnus-scores-articles - (sort gnus-scores-articles - (lambda (a1 a2) - (< (mail-header-number (car a1)) - (mail-header-number (car a2)))))) - (set-buffer nntp-server-buffer) - (save-restriction - (let* ((buffer-read-only nil) - (articles gnus-scores-articles) - (all-scores scores) - (request-func (cond ((string= "head" header) - 'gnus-request-head) - ((string= "body" header) - 'gnus-request-body) - (t 'gnus-request-article))) - entries alist ofunc article last) - (when articles - (setq last (mail-header-number (caar (last articles)))) - ;; Not all backends support partial fetching. In that case, - ;; we just fetch the entire article. - (unless (gnus-check-backend-function - (and (string-match "^gnus-" (symbol-name request-func)) - (intern (substring (symbol-name request-func) - (match-end 0)))) - gnus-newsgroup-name) - (setq ofunc request-func) - (setq request-func 'gnus-request-article)) - (while articles - (setq article (mail-header-number (caar articles))) - (gnus-message 7 "Scoring article %s of %s..." article last) - (widen) - (when (funcall request-func article gnus-newsgroup-name) - (goto-char (point-min)) - ;; If just parts of the article is to be searched, but the - ;; backend didn't support partial fetching, we just narrow - ;; to the relevant parts. - (when ofunc - (if (eq ofunc 'gnus-request-head) - (narrow-to-region - (point) - (or (search-forward "\n\n" nil t) (point-max))) - (narrow-to-region - (or (search-forward "\n\n" nil t) (point)) - (point-max)))) - (setq scores all-scores) - ;; Find matches. - (while scores - (setq alist (pop scores) - entries (assoc header alist)) - (while (cdr entries) ;First entry is the header index. - (let* ((rest (cdr entries)) - (kill (car rest)) - (match (nth 0 kill)) - (type (or (nth 3 kill) 's)) - (score (or (nth 1 kill) - gnus-score-interactive-default-score)) - (date (nth 2 kill)) - (found nil) - (case-fold-search - (not (or (eq type 'R) (eq type 'S) - (eq type 'Regexp) (eq type 'String)))) - (search-func - (cond ((or (eq type 'r) (eq type 'R) - (eq type 'regexp) (eq type 'Regexp)) - 're-search-forward) - ((or (eq type 's) (eq type 'S) - (eq type 'string) (eq type 'String)) - 'search-forward) - (t - (error "Invalid match type: %s" type))))) - (goto-char (point-min)) - (when (funcall search-func match nil t) - ;; Found a match, update scores. - (setcdr (car articles) (+ score (cdar articles))) - (setq found t) - (when trace - (push - (cons (car-safe (rassq alist gnus-score-cache)) - kill) - gnus-score-trace))) - ;; Update expire date - (unless trace - (cond - ((null date)) ;Permanent entry. - ((and found gnus-update-score-entry-dates) - ;; Match, update date. - (gnus-score-set 'touched '(t) alist) - (setcar (nthcdr 2 kill) now)) - ((and expire (< date expire)) ;Old entry, remove. - (gnus-score-set 'touched '(t) alist) - (setcdr entries (cdr rest)) - (setq rest entries)))) - (setq entries rest))))) - (setq articles (cdr articles))))))) - nil)) + (if gnus-agent-fetching + nil + (save-excursion + (setq gnus-scores-articles + (sort gnus-scores-articles + (lambda (a1 a2) + (< (mail-header-number (car a1)) + (mail-header-number (car a2)))))) + (set-buffer nntp-server-buffer) + (save-restriction + (let* ((buffer-read-only nil) + (articles gnus-scores-articles) + (all-scores scores) + (request-func (cond ((string= "head" header) + 'gnus-request-head) + ;; We need to peek at the headers to detect + ;; the content encoding + ((string= "body" header) + 'gnus-request-article) + (t 'gnus-request-article))) + entries alist ofunc article last) + (when articles + (setq last (mail-header-number (caar (last articles)))) + ;; Not all backends support partial fetching. In that case, + ;; we just fetch the entire article. + (unless (gnus-check-backend-function + (and (string-match "^gnus-" (symbol-name request-func)) + (intern (substring (symbol-name request-func) + (match-end 0)))) + gnus-newsgroup-name) + (setq ofunc request-func) + (setq request-func 'gnus-request-article)) + (while articles + (setq article (mail-header-number (caar articles))) + (gnus-message 7 "Scoring article %s of %s..." article last) + (widen) + (let (handles) + (when (funcall request-func article gnus-newsgroup-name) + (when (string= "body" header) + (setq handles (gnus-score-decode-text-parts))) + (goto-char (point-min)) + ;; If just parts of the article is to be searched, but the + ;; backend didn't support partial fetching, we just narrow + ;; to the relevant parts. + (when ofunc + (if (eq ofunc 'gnus-request-head) + (narrow-to-region + (point) + (or (search-forward "\n\n" nil t) (point-max))) + (narrow-to-region + (or (search-forward "\n\n" nil t) (point)) + (point-max)))) + (setq scores all-scores) + ;; Find matches. + (while scores + (setq alist (pop scores) + entries (assoc header alist)) + (while (cdr entries) ;First entry is the header index. + (let* ((rest (cdr entries)) + (kill (car rest)) + (match (nth 0 kill)) + (type (or (nth 3 kill) 's)) + (score (or (nth 1 kill) + gnus-score-interactive-default-score)) + (date (nth 2 kill)) + (found nil) + (case-fold-search + (not (or (eq type 'R) (eq type 'S) + (eq type 'Regexp) (eq type 'String)))) + (search-func + (cond ((or (eq type 'r) (eq type 'R) + (eq type 'regexp) (eq type 'Regexp)) + 're-search-forward) + ((or (eq type 's) (eq type 'S) + (eq type 'string) (eq type 'String)) + 'search-forward) + (t + (error "Invalid match type: %s" type))))) + (goto-char (point-min)) + (when (funcall search-func match nil t) + ;; Found a match, update scores. + (setcdr (car articles) (+ score (cdar articles))) + (setq found t) + (when trace + (push + (cons (car-safe (rassq alist gnus-score-cache)) + kill) + gnus-score-trace))) + ;; Update expire date + (unless trace + (cond + ((null date)) ;Permanent entry. + ((and found gnus-update-score-entry-dates) + ;; Match, update date. + (gnus-score-set 'touched '(t) alist) + (setcar (nthcdr 2 kill) now)) + ((and expire (< date expire)) ;Old entry, remove. + (gnus-score-set 'touched '(t) alist) + (setcdr entries (cdr rest)) + (setq rest entries)))) + (setq entries rest)))) + (when handles (mm-destroy-parts handles)))) + (setq articles (cdr articles))))))) + nil)) (defun gnus-score-thread (scores header now expire &optional trace) (gnus-score-followup scores header now expire trace t)) --=-=-=--