From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.general/26529
Path: main.gmane.org!not-for-mail
From: Eric Marsden
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Announce: nnwarchive
Date: 09 Nov 1999 18:52:20 +0100
Organization: LAAS-CNRS http://www.laas.fr/
Sender: owner-ding@hpc.uh.edu
Message-ID:
References: <5biu3bd2dh.fsf@giga.cs.rochester.edu>
NNTP-Posting-Host: coloc-standby.netfonds.no
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: main.gmane.org 1035163721 19877 80.91.224.250 (21 Oct 2002 01:28:41 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Mon, 21 Oct 2002 01:28:41 +0000 (UTC)
Return-Path:
Original-Received: from lisa.math.uh.edu (lisa.math.uh.edu [129.7.128.49]) by sclp3.sclp.com (8.8.5/8.8.5) with ESMTP id MAA20064 for ; Tue, 9 Nov 1999 12:52:56 -0500 (EST)
Original-Received: from sina.hpc.uh.edu (lists@Sina.HPC.UH.EDU [129.7.3.5]) by lisa.math.uh.edu (8.9.1/8.9.1) with ESMTP id LAB00507; Tue, 9 Nov 1999 11:52:50 -0600 (CST)
Original-Received: by sina.hpc.uh.edu (TLB v0.09a (1.20 tibbs 1996/10/09 22:03:07)); Tue, 09 Nov 1999 11:53:06 -0600 (CST)
Original-Received: from sclp3.sclp.com (root@sclp3.sclp.com [204.252.123.139]) by sina.hpc.uh.edu (8.9.3/8.9.3) with ESMTP id LAA03239 for ; Tue, 9 Nov 1999 11:52:53 -0600 (CST)
Original-Received: from laas.laas.fr (root@laas.laas.fr [140.93.0.15]) by sclp3.sclp.com (8.8.5/8.8.5) with ESMTP id MAA20054 for ; Tue, 9 Nov 1999 12:52:20 -0500 (EST)
Original-Received: from dukas.laas.fr (dukas [140.93.21.58]) by laas.laas.fr (8.9.3/8.9.3) with ESMTP id SAA12909 for ; Tue, 9 Nov 1999 18:52:16 +0100 (MET)
Original-Received: (from emarsden@localhost) by dukas.laas.fr (8.9.3/8.9.3) id SAA18821; Tue, 9 Nov 1999 18:52:20 +0100 (MET)
Original-To: Gnus Mailing List
X-Eric-Conspiracy: there is no conspiracy
X-Attribution: ecm
X-URL: http://www.chez.com/emarsden/
In-Reply-To: Shenghuo ZHU's message of "09 Nov
 1999 12:02:50 -0500"
Original-Lines: 20
X-Mailer: Gnus v5.7/Emacs 20.4
Precedence: list
X-Majordomo: 1.94.jlt7
Xref: main.gmane.org gmane.emacs.gnus.general:26529
X-Report-Spam: http://spam.gmane.org/gmane.emacs.gnus.general:26529

There are now a number of packages which wash HTML pages: nnweb.el,
nnslashdot.el, nnwarchive.el, and my babel.el and watson.el (which
wash various search engines and translation services; see X-URL).
I'm probably forgetting some.

They all operate in basically the same way: download the page,
extract the information, and return it formatted. They also face the
same challenge: getting updates to users conveniently and securely,
and relatively often, since web sites undergo facelifts.

So I am wondering if there is scope for a generic "wash.el" which
would take as arguments a URL (dynamically generated in certain
cases) and a parser function which operates on the raw HTML and
returns matches. It would also provide automatic or semi-automatic
update services, connecting to some trusted web site where
washing-authors can put updates.

-- 
Eric Marsden
It's elephants all the way down
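
To make the proposed interface concrete, here is a minimal sketch of the "URL plus parser function" idea. It is written in Python rather than Emacs Lisp purely for illustration, and it is not taken from any of the packages named above. The names `wash`, `fetch`, `link_parser` and `format_link` are all hypothetical, and the update service is left out entirely.

```python
import re
from typing import Callable, Iterable, List
from urllib.request import urlopen


def fetch(url: str) -> str:
    """Download a page and return it as text (hypothetical default fetcher)."""
    with urlopen(url) as response:
        return response.read().decode("iso-8859-1")


def wash(url: str,
         parser: Callable[[str], Iterable],
         formatter: Callable = str,
         fetcher: Callable[[str], str] = fetch) -> List[str]:
    """Generic washer: download the page, extract matches, return them formatted.

    `parser` works on the raw HTML and returns the matches; `formatter`
    turns one match into a display string. Both are site-specific and are
    the only parts an author would need to update after a site facelift.
    """
    raw_html = fetcher(url)
    return [formatter(match) for match in parser(raw_html)]


def link_parser(html: str):
    """Example site-specific parser: (href, text) pairs for simple anchors."""
    return re.findall(r'<a href="([^"]+)">([^<]*)</a>', html)


def format_link(match) -> str:
    """Example formatter for the pairs returned by link_parser."""
    href, text = match
    return f"{text} <{href}>"


# Usage with a canned page, so nothing is fetched over the network.
sample = ('<p><a href="http://www.gnus.org/">Gnus</a> and '
          '<a href="http://www.laas.fr/">LAAS</a></p>')
print(wash("http://example.invalid/", link_parser, format_link,
           fetcher=lambda url: sample))
```

Passing the fetcher in as an argument is one possible way to cover URLs that are generated dynamically or need special handling; a real wash.el might well do this differently.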