From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.general/62078
Path: news.gmane.org!not-for-mail
From: Andreas Seltenreich
Newsgroups: gmane.discuss,gmane.emacs.gnus.general
Subject: Re: nnweb + Gmane search
Date: Fri, 24 Feb 2006 01:49:57 +0100
Message-ID: <874q2p1sy2.fsf@gate450.dyndns.org>
References: <877j9lob4v.fsf@gate450.dyndns.org>
 <87hd8pmtjf.fsf@gate450.dyndns.org>
 <87u0bqf4pn.fsf@gate450.dyndns.org>
 <87acdduewd.fsf@gate450.dyndns.org>
 <87vevqcs10.fsf_-_@gate450.dyndns.org>
 <871wyd9m04.fsf@gate450.dyndns.org>
 <87fymquu3x.fsf@gate450.dyndns.org>
 <877j7xemvm.fsf@gate450.dyndns.org>
NNTP-Posting-Host: main.gmane.org
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
X-Trace: sea.gmane.org 1140742282 11640 80.91.229.2 (24 Feb 2006 00:51:22 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Fri, 24 Feb 2006 00:51:22 +0000 (UTC)
Original-X-From: gmane-discuss-admin@hawk.netfonds.no Fri Feb 24 01:51:19 2006
Return-path:
Envelope-to: gd-gmane-discuss@m.gmane.org
Original-Received: from hawk.netfonds.no ([80.91.224.246])
 by ciao.gmane.org with esmtp (Exim 4.43) id 1FCRAn-0001WS-SZ
 for gd-gmane-discuss@m.gmane.org; Fri, 24 Feb 2006 01:51:07 +0100
Original-Received: from localhost ([127.0.0.1] helo=hawk.netfonds.no)
 by hawk.netfonds.no with esmtp (Exim 3.35 #1 (Debian))
 id 1FCRAk-0004Ld-00; Fri, 24 Feb 2006 01:51:02 +0100
Original-Received: from smtp2.rz.uni-karlsruhe.de ([129.13.185.218])
 by hawk.netfonds.no with esmtp (Exim 3.35 #1 (Debian))
 id 1FCR9r-0004LU-00 for ; Fri, 24 Feb 2006 01:50:07 +0100
Original-Received: from rzstud1.stud.uni-karlsruhe.de (rzstud1.stud.uni-karlsruhe.de [193.196.41.33])
 by smtp2.rz.uni-karlsruhe.de with esmtp (Exim 4.50 #1)
 id 1FCR9p-0005pH-0O; Fri, 24 Feb 2006 01:50:07 +0100
Original-Received: from uwi7 by rzstud1.stud.uni-karlsruhe.de with local (Exim 3.36 #1)
 id 1FCR9s-0005AG-00; Fri, 24 Feb 2006 01:50:09 +0100
Original-To: ding@gnus.org, gmane-discuss@hawk.netfonds.no
X-PGP-Key: 0x2C006B340F8C8C1B
X-Now-Playing: Edge of Sanity / Purgatory Afterglow
X-Face: $:F<87a[gD1?#R6S3j21cr1&C&7bd63GHC.tSdskUb}hhwG(ci*=D5kJ<_N+p9q(7-,PnG. Et.Yh (Olly Betts's message of "Thu, 23 Feb 2006 23:04:22 +0000 (UTC)")
User-Agent: Gnus/5.110004 (No Gnus v0.4) Emacs/22.0.50 (gnu/linux)
Original-Sender: gmane-discuss-admin@hawk.netfonds.no
Errors-To: gmane-discuss-admin@hawk.netfonds.no
X-BeenThere: gmane-discuss@hawk.netfonds.no
X-Mailman-Version: 2.0.11
Precedence: bulk
List-Help:
List-Post:
List-Subscribe: ,
List-Id:
List-Unsubscribe: ,
List-Archive:
Xref: news.gmane.org gmane.discuss:9325 gmane.emacs.gnus.general:62078
Archived-At:

--=-=-=

Olly Betts writes:

> On 2006-02-23, Reiner Steib wrote:
>> Please do, unless Olly isn't happy with the current output. We can
>> easily adjust the URL in Gnus later if Olly doesn't want to include
>> FMT=nov in the default CGI script yet. But it would be nice to have a
>> permanent URL.
>
> I'm happy with the output (apart from the extra newline at the end but
> that's a very minor issue).

Ok, I've attached a patch. Future changes in the output to anything
besides the Xref header should be transparent to the code. E.g., if
users wanted a "Newsgroups:" extra header, it should "just work" as
long as it is valid nov.

regards,
andreas

2006-02-24  Andreas Seltenreich

	* nnweb.el (nnweb-type-definition, nnweb-gmane-create-mapping,
	nnweb-gmane-wash-article, nnweb-gmane-search): Fix Gmane web groups.
	Kudos to Olly Betts for providing NOV output on the server side.
	(nnweb-google-create-mapping): Update regexps and add some progress
	indication.

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=nnweb.patch

Index: nnweb.el
===================================================================
RCS file: /usr/local/cvsroot/gnus/lisp/nnweb.el,v
retrieving revision 7.15
diff -c -r7.15 nnweb.el
*** nnweb.el 13 Feb 2006 13:32:28 -0000 7.15
--- nnweb.el 23 Feb 2006 19:39:02 -0000
***************
*** 27,35 ****
  
  ;; Note: You need to have `w3' installed for some functions to work.
  
- ;; FIXME: Due to changes in the HTML output of Gmane, stuff related to Gmane
- ;; web groups (`gnus-group-make-web-group') doesn't work anymore.
- 
  ;;; Code:
  
  (eval-when-compile (require 'cl))
--- 27,32 ----
***************
*** 82,88 ****
       (reference . identity)
       (map . nnweb-gmane-create-mapping)
       (search . nnweb-gmane-search)
!      (address . "http://gmane.org/")
       (identifier . nnweb-gmane-identity)))
    "Type-definition alist.")
  
--- 79,85 ----
       (reference . identity)
       (map . nnweb-gmane-create-mapping)
       (search . nnweb-gmane-search)
!      (address . "http://search.gmane.org/cgi-bin/omega.cgi")
       (identifier . nnweb-gmane-identity)))
    "Type-definition alist.")
  
***************
*** 407,412 ****
--- 404,410 ----
      (save-excursion
        (set-buffer nnweb-buffer)
        (erase-buffer)
+       (nnheader-message 7 "Searching google...")
        (when (funcall (nnweb-definition 'search) nnweb-search)
          (let ((more t)
                (i 0))
***************
*** 417,431 ****
              (goto-char (point-min))
              (incf i 100)
              (if (or (not (re-search-forward
!                           "\"]+\\)\">= i nnweb-max-hits))
                  (setq more nil)
                ;; Yup, there are more articles
                (setq more (concat (nnweb-definition 'base) (match-string 1)))
                (when more
                  (erase-buffer)
                  (mm-url-insert more))))
          ;; Return the articles in the right order.
          (setq nnweb-articles
                (sort nnweb-articles 'car-less-than-car))))))
  
--- 415,432 ----
              (goto-char (point-min))
              (incf i 100)
              (if (or (not (re-search-forward
!                           "\"]+\\)\">= i nnweb-max-hits))
                  (setq more nil)
                ;; Yup, there are more articles
                (setq more (concat (nnweb-definition 'base) (match-string 1)))
                (when more
                  (erase-buffer)
+                 (nnheader-message 7 "Searching google...(%d)" i)
                  (mm-url-insert more))))
          ;; Return the articles in the right order.
+         (nnheader-message 7 "Searching google...done")
          (setq nnweb-articles
                (sort nnweb-articles 'car-less-than-car))))))
  
***************
*** 458,503 ****
    "Perform the search and create a number-to-url alist."
    (save-excursion
      (set-buffer nnweb-buffer)
!     (erase-buffer)
!     (when (funcall (nnweb-definition 'search) nnweb-search)
!       (let ((more t)
!             (case-fold-search t)
!             (active (or (cadr (assoc nnweb-group nnweb-group-alist))
!                         (cons 1 0)))
!             subject group url
!             map)
!         ;; Remove stuff from the beginning of results
!         (goto-char (point-min))
!         (search-forward "Search Results
    " nil t)
!         (delete-region (point-min) (point))
          (goto-char (point-min))
!         ;; Iterate over the actual hits
!         (while (re-search-forward ".*href=\"\\([^\"]+\\)\">\\(.*\\)" nil t)
!           (setq url (concat "http://gmane.org/" (match-string 1)))
!           (setq subject (match-string 2))
!           (unless (nnweb-get-hashtb url)
!             (push
!              (list
!               (incf (cdr active))
!               (make-full-mail-header
!                (cdr active) (concat "(" group ") " subject) nil nil
!                nil nil 0 0 url))
!              map)
!             (nnweb-set-hashtb (cadar map) (car map))))
!         ;; Return the articles in the right order.
!         (setq nnweb-articles
!               (sort (nconc nnweb-articles map) 'car-less-than-car))))))
  
  (defun nnweb-gmane-wash-article ()
    (let ((case-fold-search t))
      (goto-char (point-min))
!     (search-forward "" nil t)
!     (delete-region (point-min) (point))
!     (goto-char (point-min))
!     (while (looking-at "^
  • \\([^ ]+\\).*
  • ")
!       (replace-match "\\1\\2" t)
!       (forward-line 1))
!     (mm-url-remove-markup)))
  
  (defun nnweb-gmane-search (search)
    (mm-url-insert
--- 459,519 ----
    "Perform the search and create a number-to-url alist."
    (save-excursion
      (set-buffer nnweb-buffer)
!     (let ((case-fold-search t)
!           (active (or (cadr (assoc nnweb-group nnweb-group-alist))
!                       (cons 1 0)))
!           map)
!       (erase-buffer)
!       (nnheader-message 7 "Searching Gmane..." )
!       (when (funcall (nnweb-definition 'search) nnweb-search)
          (goto-char (point-min))
!         ;; Skip the status line
!         (forward-line 1)
!         ;; Thanks to Olly Betts we now have NOV lines in our buffer!
!         (while (not (eobp))
!           (unless (eolp)
!             (let ((header (nnheader-parse-nov)))
!               (let ((xref (mail-header-xref header))
!                     (from (mail-header-from header))
!                     (subject (mail-header-subject header))
!                     (rfc2047-encoding-type 'mime))
!                 (when (string-match " \\([^:]+\\):\\([0-9]+\\)" xref)
!                   (mail-header-set-xref
!                    header
!                    (format "http://article.gmane.org/%s/%s/raw"
!                            (match-string 1 xref)
!                            (match-string 2 xref))))
! 
!                 ;; Add host part to gmane-encrypted addresses
!                 (when (string-match "@$" from)
!                   (mail-header-set-from header
!                                         (concat from "public.gmane.org")))
! 
!                 (mail-header-set-subject header
!                                          (rfc2047-encode-string subject))
! 
!                 (unless (nnweb-get-hashtb (mail-header-xref header))
!                   (push
!                    (list
!                     (incf (cdr active))
!                     header)
!                    map)
!                   (nnweb-set-hashtb (cadar map) (car map))))))
!           (forward-line 1)))
!       (nnheader-message 7 "Searching Gmane...done")
!       (setq nnweb-articles
!             (sort (nconc nnweb-articles map) 'car-less-than-car)))))
  
  (defun nnweb-gmane-wash-article ()
    (let ((case-fold-search t))
      (goto-char (point-min))
!     (when (search-forward "" nil t)
!       (delete-region (point-min) (point))
!       (goto-char (point-min))
!       (while (looking-at "^
  • \\([^ ]+\\).*
  • ")
!         (replace-match "\\1\\2" t)
!         (forward-line 1))
!       (mm-url-remove-markup))))
  
  (defun nnweb-gmane-search (search)
    (mm-url-insert
***************
*** 505,514 ****
       (nnweb-definition 'address)
       "?"
       (mm-url-encode-www-form-urlencoded
!       `(("query" . ,search)))))
    (setq buffer-file-name nil)
    t)
  
- 
  (defun nnweb-gmane-identity (url)
    "Return a unique identifier based on URL."
--- 521,533 ----
       (nnweb-definition 'address)
       "?"
       (mm-url-encode-www-form-urlencoded
!       `(("query" . ,search)
!         ("FMT" . "nov")
!         ("HITSPERPAGE" . ,(number-to-string nnweb-max-hits))))))
    (setq buffer-file-name nil)
+   (set-buffer-multibyte t)
+   (mm-decode-coding-region (point-min) (point-max) 'utf-8)
    t)
  
  (defun nnweb-gmane-identity (url)
    "Return a unique identifier based on URL."
--=-=-=--