From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.general/43033
Path: main.gmane.org!not-for-mail
From: Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai =?iso-8859-1?q?Gro=DFjohann?=)
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Read Heise newsticker: hackers wanted
Date: Mon, 11 Feb 2002 10:14:17 +0100
Sender: owner-ding@hpc.uh.edu
Message-ID:
References:
NNTP-Posting-Host: coloc-standby.netfonds.no
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
X-Trace: main.gmane.org 1035178190 15397 80.91.224.250 (21 Oct 2002 05:29:50 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Mon, 21 Oct 2002 05:29:50 +0000 (UTC)
Return-Path:
Original-Received: (qmail 22365 invoked from network); 11 Feb 2002 09:16:27 -0000
Original-Received: from malifon.math.uh.edu (mail@129.7.128.13) by mastaler.com with SMTP; 11 Feb 2002 09:16:27 -0000
Original-Received: from sina.hpc.uh.edu ([129.7.128.10] ident=lists) by malifon.math.uh.edu with esmtp (Exim 3.20 #1) id 16aCYG-0007zn-00; Mon, 11 Feb 2002 03:15:08 -0600
Original-Received: by sina.hpc.uh.edu (TLB v0.09a (1.20 tibbs 1996/10/09 22:03:07)); Mon, 11 Feb 2002 03:15:06 -0600 (CST)
Original-Received: from sclp3.sclp.com (qmailr@sclp3.sclp.com [209.196.61.66]) by sina.hpc.uh.edu (8.9.3/8.9.3) with SMTP id DAA27797 for ; Mon, 11 Feb 2002 03:14:54 -0600 (CST)
Original-Received: (qmail 21157 invoked by alias); 11 Feb 2002 09:14:51 -0000
Original-Received: (qmail 21148 invoked from network); 11 Feb 2002 09:14:50 -0000
Original-Received: from waldorf.cs.uni-dortmund.de (129.217.4.42) by gnus.org with SMTP; 11 Feb 2002 09:14:50 -0000
Original-Received: from lothlorien.cs.uni-dortmund.de (lothlorien [129.217.19.67]) by waldorf.cs.uni-dortmund.de with ESMTP id g1B9ENb05542 for ; Mon, 11 Feb 2002 10:14:23 +0100 (MET)
Original-Received: from lucy.cs.uni-dortmund.de (lucy [129.217.19.80]) by lothlorien.cs.uni-dortmund.de id KAA12996; Mon, 11 Feb 2002 10:14:18 +0100 (MET)
Original-Received:
by lucy.cs.uni-dortmund.de (Postfix, from userid 6104) id D232D3ADC6; Mon, 11 Feb 2002 10:14:17 +0100 (CET)
Original-To: ding@gnus.org
In-Reply-To: (Kai.Grossjohann@cs.uni-dortmund.de's message of "Sat, 09 Feb 2002 21:16:07 +0100")
Original-Lines: 54
User-Agent: Gnus/5.090006 (Oort Gnus v0.06) Emacs/21.2.50 (i686-pc-linux-gnu)
Precedence: list
X-Majordomo: 1.94.jlt7
Xref: main.gmane.org gmane.emacs.gnus.general:43033
X-Report-Spam: http://spam.gmane.org/gmane.emacs.gnus.general:43033

--=-=-=
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable

Here is a slightly improved version which does not display "\201=E4"
instead of "=E4" and so on.  Also, you can use it to read the
newsticker of the Tagesschau, a German public TV news show.

--=-=-=
Content-Type: application/emacs-lisp
Content-Disposition: inline

(require 'nnrss)
(require 'w3)

(add-to-list 'nnrss-group-alist
             '("Tagesschau"
               "http://www.tagesschau.de/newsticker.rdf"
               "Nachrichten der Tagesschau."))

(defun kai-nnrss-content-function (entry group article)
  (let* ((num (nth 0 entry))
         (timestamp (nth 1 entry))
         (url (nth 2 entry))
         (buf (url-retrieve-synchronously url))
         (w3-display-same-buffer t)
         (w3-explicit-coding-system 'iso-8859-1)
         (w3-delay-image-loads t)
         parse pre-search post-search delete)
    ;; Pick the literal marker strings used to cut the article text
    ;; out of the fetched page, depending on the group name.
    (cond ((string-match "[hH]eise" group)
           (setq pre-search ""
                 post-search ""))
          ((string-match "[tT]agesschau" group)
           (setq pre-search "class=\"content\">\n"
                 post-search "")))
    (save-excursion
      (set-buffer buf)
      (goto-char (point-min))
      ;; Delete everything up to the end of the first pre-search match.
      (delete-region (point) (search-forward pre-search))
      (insert "\n" "\n" "\n")
      ;; Delete everything after the post-search match.
      (delete-region (search-forward post-search) (point-max))
      (insert "\n\n")
      (setq parse (w3-parse-buffer buf)))
    (kill-buffer buf)
    ;; Draw the parsed tree, then re-encode everything from the old
    ;; point to the end of the buffer as Latin-1.
    (let ((b (point)))
      (w3-draw-tree parse)
      (encode-coding-region b (point-max) 'iso-latin-1))))

(setq nnrss-content-function 'kai-nnrss-content-function)

--=-=-=
Content-Disposition: inline

But it's still quite hair-raising: first of all, regexp searching does not strike
me as the right way to find the relevant parts of the page, and
secondly, the explicit and unconditional re-encoding of the W3 tree to
Latin-1 smells really bad.

kai
-- 
~/.signature is: umop 3p!sdn       (Frank Nobis)
--=-=-=--