Gnus development mailing list
* nndejanews -- http help needed
@ 1996-08-17 18:16 Lars Magne Ingebrigtsen
  1996-08-17 21:07 ` Sudish Joseph
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Lars Magne Ingebrigtsen @ 1996-08-17 18:16 UTC (permalink / raw)


I thought it would be way neat to write this backend, so I have.
Well, I've written the skeleton, but since I know nothing about the
internals of w3, I wonder whether someone could point me in the right
direction.  I need functions for:

1) Fetching an URL and waiting for the result
2) Fetching an URL and returning immediately (must have that async
prefetch, you know)
3) Decoding HTML entities -- &gt;
4) Encoding a string into an URL
5) Doing a POST form thingie

Well, that should tide me over, I think...

Interface-wise, I thought this would work like this:

1) The user types a magic command (`G N') and a search criterion
2) Gnus fetches the "headers" for all the articles that match -- which
means sniffing the search results (and "next") and stuff
3) A normal summary buffer is created
4) The user reads the articles

Simple, eh?

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@ifi.uio.no * Lars Ingebrigtsen



* Re: nndejanews -- http help needed
  1996-08-17 18:16 nndejanews -- http help needed Lars Magne Ingebrigtsen
@ 1996-08-17 21:07 ` Sudish Joseph
  1996-08-18  7:23   ` Joev Dubach
  1996-08-18  4:03 ` William Perry
  1996-08-18 17:58 ` Steven L Baur
  2 siblings, 1 reply; 14+ messages in thread
From: Sudish Joseph @ 1996-08-17 21:07 UTC (permalink / raw)


Lars Magne Ingebrigtsen <larsi@ifi.uio.no> writes:
> Simple, eh?

Awesome.

-Sudish



* Re: nndejanews -- http help needed
  1996-08-17 18:16 nndejanews -- http help needed Lars Magne Ingebrigtsen
  1996-08-17 21:07 ` Sudish Joseph
@ 1996-08-18  4:03 ` William Perry
  1996-08-18 12:06   ` Lars Magne Ingebrigtsen
  1996-08-18 17:58 ` Steven L Baur
  2 siblings, 1 reply; 14+ messages in thread
From: William Perry @ 1996-08-18  4:03 UTC (permalink / raw)
  Cc: ding

Lars Magne Ingebrigtsen writes:
>I thought it would be way neat to write this backend, so I have.
>Well, I've written the skeleton, but since I know nothing about the
>internals of w3, I wonder whether someone could point me in the right
>direction.  I need functions for:
>
>1) Fetching an URL and waiting for the result

(url-insert-file-contents "URL")

>2) Fetching an URL and returning immediately (must have that async
>   prefetch, you know) 

  w/this function, 'callback' will be called with 'data' when the URL has
been retrieved.

(defun gnus-url-retrieve-asynch (url callback &rest data)
  "Fetch URL asynchronously; CALLBACK is called with DATA once it arrives."
  (let ((url-request-method "GET")
        (old-asynch url-be-asynchronous)
        (url-request-data nil)
        (url-request-extra-headers nil)
        (url-working-buffer (generate-new-buffer-name " *dejanews*")))
    ;; Make asynchronous retrieval the default for the duration of this call.
    (setq-default url-be-asynchronous t)
    (save-excursion
      (set-buffer (get-buffer-create url-working-buffer))
      ;; Record the callback and its arguments in the working buffer.
      (setq url-current-callback-data data
            url-be-asynchronous t
            url-current-callback-func callback)
      (url-retrieve url))
    ;; Restore the previous default.
    (setq-default url-be-asynchronous old-asynch)))
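
  A rough equivalent for the url library bundled with current Emacs, as a
sketch only: it assumes `url-retrieve' accepts a callback and an argument
list directly, and the name `my-fetch-async' is made up for illustration.

```elisp
(require 'url)

(defun my-fetch-async (url callback &rest data)
  "Fetch URL in the background, then call CALLBACK.
CALLBACK receives the response buffer, the status plist, and DATA."
  (url-retrieve url
                ;; `url-retrieve' calls this with the status plist followed
                ;; by the elements of the list given as its third argument.
                (lambda (status cb args)
                  (apply cb (current-buffer) status args))
                (list callback data)))
```

  The callback runs with the response buffer current, so headers and body
can be parsed from there.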

>3) Decoding HTML entities -- &gt;



>4) Encoding a string into an URL

(url-hexify-string string)

>5) Doing a POST form thingie

  Well, the forms stuff expects widgets to be passed in (Emacs-W3 now uses
Per's widget stuff), so you'll have to fake it out a little bit.  Try:

(defun gnus-encode-www-form-urlencoded (pairs)
  (mapconcat 
    (function
      (lambda (data)
        (concat (w3-form-encode-xwfu (car data)) "="
                (w3-form-encode-xwfu (cdr data))))) pairs "&"))

  If you are using the 'get' method, just tack this onto the end of the URL
with a "?", and you should be good.  For posting, let-bind url-request-data
to what's returned by this, url-request-method to the string "POST", and
url-request-extra-headers to something like
 '(("Content-type" . "application/x-www-form-urlencoded"))

  then fetch the HTTP URL.
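
  The POST recipe above might look roughly like this.  A minimal sketch,
assuming the url library bundled with current Emacs rather than the
Emacs-W3 internals of the time: `url-hexify-string' stands in for
w3-form-encode-xwfu, and the `my-' names are hypothetical.

```elisp
(require 'url)
(require 'url-util)

(defun my-form-encode (pairs)
  "Encode PAIRS, an alist of (NAME . VALUE) strings, as a form body."
  (mapconcat (lambda (pair)
               (concat (url-hexify-string (car pair)) "="
                       (url-hexify-string (cdr pair))))
             pairs "&"))

(defun my-post-form (url pairs)
  "POST PAIRS to URL and return the buffer holding the response."
  (let ((url-request-method "POST")
        (url-request-extra-headers
         '(("Content-Type" . "application/x-www-form-urlencoded")))
        (url-request-data (my-form-encode pairs)))
    ;; Blocks until the server has answered.
    (url-retrieve-synchronously url)))
```

  For a GET request the same encoded string would simply be appended to the
URL after a "?".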

>Well, that should tide me over, I think...
>
>Interface-wise, I thought this would work like this:
>
>1) The user types a magic command (`G N') and a search criterion
>2) Gnus fetches the "headers" for all the articles that match -- which
>means sniffing the search results (and "next") and stuff
>3) A normal summary buffer is created
>4) The user reads the articles
>
>Simple, eh?

  Sounds very very useful.  Perhaps the base functionality should be in W3,
ie: a function to retrieve a search page, and then return a list of all the
headers to documents contained in the results?

-Bill P.



* Re: nndejanews -- http help needed
  1996-08-17 21:07 ` Sudish Joseph
@ 1996-08-18  7:23   ` Joev Dubach
  1996-08-18 12:09     ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 14+ messages in thread
From: Joev Dubach @ 1996-08-18  7:23 UTC (permalink / raw)
  Cc: ding

Sudish Joseph <sudish@mindspring.com> writes:
> Lars Magne Ingebrigtsen <larsi@ifi.uio.no> writes:
>> Simple, eh?
> Awesome.

Agreed.  Will it also be able to access Usenet archives at AltaVista,
or are those a different enough protocol that it won't be possible?
Also, will one be able to hit ^, and, when the parent article isn't on
the local server, be prompted whether or not to look for it on
DejaNews?  Infinite thread regression!

Joev                            <URL:"http://www.math.harvard.edu/~joev/">

  "'Usenet is not a right.'
   'Usenet is a right, a left, a jab, and a sharp uppercut to the
    jaw. The postman hits!  You have new mail.'"
          -- Ed Vielmetti and Chip Salzenberg



* Re: nndejanews -- http help needed
  1996-08-18  4:03 ` William Perry
@ 1996-08-18 12:06   ` Lars Magne Ingebrigtsen
  1996-08-18 17:07     ` William Perry
  0 siblings, 1 reply; 14+ messages in thread
From: Lars Magne Ingebrigtsen @ 1996-08-18 12:06 UTC (permalink / raw)


William Perry <wmperry@aventail.com> writes:

[lots of excellent code snipped.]

I just plonked in your code -- and it worked almost at the first
attempt.  Squeel!

nndejagnus (I renamed it to avoid being sued) actually works.  With
article prefetch and all.  I'm flabbergasted.

> >3) Decoding HTML entities -- &gt;
> 
> 
> 

I looked over w3, but I couldn't find any function for doing this --
so I assume it hasn't been separated out into its own function?  It
looked to me like it was handled deep in the bowels of w3-parse
somewhere...  

Anyways, I just used the `w3-html-entities' list and wrote a snippet
to decode entities.
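
A table-driven decoder of that kind could be sketched as follows.  This is
not the snippet that went into the backend: the table here is a tiny
stand-in for the `w3-html-entities' list, and the `my-' names are invented
for the example.

```elisp
(defvar my-html-entities
  '(("lt" . "<") ("gt" . ">") ("amp" . "&") ("quot" . "\""))
  "Stand-in entity table mapping entity names to replacement text.")

(defun my-decode-entities (string)
  "Return STRING with the named entities in `my-html-entities' decoded.
Unknown entities are left untouched."
  (replace-regexp-in-string
   "&\\([a-zA-Z]+\\);"
   (lambda (match)
     ;; The match data refer to MATCH, so group 1 is the entity name.
     (let ((entry (assoc (match-string 1 match) my-html-entities)))
       (if entry (cdr entry) match)))
   string t t))
```

Each entity is decoded exactly once, so "&amp;lt;" comes out as "&lt;"
rather than "<".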

>   Sounds very very useful.  Perhaps the base functionality should be in W3,
> ie: a function to retrieve a search page, and then return a list of all the
> headers to documents contained in the results?

But wouldn't w3 have to handle each search engine specially?  You
know, HTML isn't used for presenting information nowadays -- it's used
for making Pretty Pages.  Gleaning any useful information from the
DejaNews output means writing quite arcane regexps...

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@ifi.uio.no * Lars Ingebrigtsen



* Re: nndejanews -- http help needed
  1996-08-18  7:23   ` Joev Dubach
@ 1996-08-18 12:09     ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 14+ messages in thread
From: Lars Magne Ingebrigtsen @ 1996-08-18 12:09 UTC (permalink / raw)


dubach1@husc.harvard.edu (Joev Dubach) writes:

> Will it also be able to access Usenet archives at AltaVista,
> or are those a different enough protocol that it won't be possible?

I don't think doing AltaVista searches with Gnus will be all that
useful -- most "documents" reached via AltaVista aren't "newsey", so
it's better to read them with a Web browser.  (w3, for instance).

Of course, searching the Usenet via AltaVista is a possibility, but
doesn't DejaNews store more stuff than AltaVista?  (I don't really
know.) 

> Also, will one be able to hit ^, and, when the parent article isn't
> on the local server, be prompted whether or not to look for it on
> DejaNews?  Infinite thread regression!

Not via DejaNews, no.  They don't offer searches on Message-IDs, as
far as I know.  InReference does, so perhaps I (or someone else) will
write an nninreference backend.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@ifi.uio.no * Lars Ingebrigtsen



* Re: nndejanews -- http help needed
  1996-08-18 12:06   ` Lars Magne Ingebrigtsen
@ 1996-08-18 17:07     ` William Perry
  1996-08-19  8:40       ` Per Abrahamsen
                         ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: William Perry @ 1996-08-18 17:07 UTC (permalink / raw)
  Cc: ding

Lars Magne Ingebrigtsen writes:
>William Perry <wmperry@aventail.com> writes:
>
>[lots of excellent code snipped.]
>
>I just plonked in your code -- and it worked almost at the first
>attempt.  Squeel!

  Cool.

>nndejagnus (I renamed it to avoid being sued) actually works.  With
>article prefetch and all.  I'm flabbergasted.

  I've been considering officially changing the name of W3 to GNUSCAPE or
NEWSCAPE, but figured I'd get smacked by someone.

>> >3) Decoding HTML entities -- &gt;
>
>I looked over w3, but I couldn't find any function for doing this -- so I
>assume it hasn't been separated out into its own function?  It looked to
>me like it was handled deep in the bowels of w3-parse somewhere...
>
>Anyways, I just used the `w3-html-entities' list and wrote a snippet
>to decode entities.

  That's what you have to do unless you use the parser.  The HTML entity
decoding is an integral part of the parser (has to be for certain types of
entities that can insert new markup as part of their expansion).

>>   Sounds very very useful.  Perhaps the base functionality should be in W3,
>> ie: a function to retrieve a search page, and then return a list of all the
>> headers to documents contained in the results?
>
>But wouldn't w3 have to handle each search engine specially?  You know,
>HTML isn't used for presenting information nowadays -- it's used for
>making Pretty Pages.  Gleaning any useful information from the DejaNews
>output means writing quite arcane regexps...

  Tell me about it - Emacs-W3's parser used to be entirely based on
regexps, with all sorts of disgusting things depending upon the order in
which they were called.  Christ - I implemented tables 2 1/2 years ago
(first tables draft ever :) using this method - GACK!

  You might want to take a look at the parse structure of a page at
dejanews or altavista - it might be possible to just recurse down the parse
tree looking for <a> tags with a specific prefix for the HREF attribute...
Hmmmm... 

  for altavista, you might be able to just look for news: links inside
<pre> sections.  Here's a snippet of the parse tree.  Looks pretty similar
for dejanews.

-Bill P.

				 (pre nil
				      ("01.Jul "
				       (b nil
					  ("comp.emacs.xemacs    "))
				       "  "
				       (a
					((href . "mailto:dkarr@nmo.gtegsc.com"))
					("dkarr@nmo.gtegsc"))
				       " "
				       (a
					((href . "news:uyd92gc61i.fsf@cheetos.nmo.gtegsc.com"))
					("L"))
				       " "
				       (a
					((href . "http://ww2.altavista.digital.com/cgi-bin/news?plain@msg@4520@comp%2eemacs%2exemacs"))
					("B"))
				       " "
				       (a
					((href . "http://ww2.altavista.digital.com/cgi-bin/news?msg@4520@comp%2eemacs%2exemacs%26emacs"))
					("Using XEmacs 19.xx with GNU"))
				       "\n03.Jul "


-Bill P.
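
  The tree walk suggested above could be sketched like this, assuming every
node has the (TAG ATTRIBUTES CHILDREN) shape seen in the snippet and that
text nodes are plain strings.  `my-collect-hrefs' is a made-up name, not a
W3 function.

```elisp
(defun my-collect-hrefs (node prefix)
  "Return the HREFs of <a> nodes under NODE that start with PREFIX.
The result is in document order."
  (when (consp node)                    ; strings (text nodes) yield nil
    (let* ((tag (car node))
           (attrs (nth 1 node))
           (children (nth 2 node))
           (href (and (eq tag 'a) (cdr (assq 'href attrs))))
           (own (and href (string-prefix-p prefix href) (list href))))
      ;; This node's own link first, then whatever the children contain.
      (append own
              (apply #'append
                     (mapcar (lambda (child)
                               (my-collect-hrefs child prefix))
                             children))))))
```

  Calling it on the <pre> node with the prefix "news:" would pick out only
the Message-ID links and skip the mailto: and http: ones.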



* Re: nndejanews -- http help needed
  1996-08-17 18:16 nndejanews -- http help needed Lars Magne Ingebrigtsen
  1996-08-17 21:07 ` Sudish Joseph
  1996-08-18  4:03 ` William Perry
@ 1996-08-18 17:58 ` Steven L Baur
  2 siblings, 0 replies; 14+ messages in thread
From: Steven L Baur @ 1996-08-18 17:58 UTC (permalink / raw)


>>>>> "Lars" == Lars Magne Ingebrigtsen <larsi@ifi.uio.no> writes:

Lars> I need functions for:

Lars> 3) Decoding HTML entities -- &gt;

Take a look at iso-sgml.el in XEmacs 19.14 lisp/psgml.  That file
contains code to decode all of the ISO Latin-1 entities.

-- 
steve@miranova.com baur
Unsolicited commercial e-mail will be proofread for $250/hour.
Andrea Seastrand: For your vote on the Telecom bill, I will vote for anyone
except you in November.



* Re: nndejanews -- http help needed
  1996-08-18 17:07     ` William Perry
@ 1996-08-19  8:40       ` Per Abrahamsen
  1996-08-19 18:01       ` Lars Magne Ingebrigtsen
  1996-08-23 14:05       ` Kai Grossjohann
  2 siblings, 0 replies; 14+ messages in thread
From: Per Abrahamsen @ 1996-08-19  8:40 UTC (permalink / raw)



William Perry <wmperry@aventail.com> writes:

>   I've been considering officially changing the name of W3 to GNUSCAPE or
> NEWSCAPE, but figured I'd get smacked by someone.

"Remember:  It is spelled w3 but it is pronounced GNUSCAPE!"



* Re: nndejanews -- http help needed
  1996-08-18 17:07     ` William Perry
  1996-08-19  8:40       ` Per Abrahamsen
@ 1996-08-19 18:01       ` Lars Magne Ingebrigtsen
  1996-08-20  2:59         ` William Perry
  1996-08-23 14:05       ` Kai Grossjohann
  2 siblings, 1 reply; 14+ messages in thread
From: Lars Magne Ingebrigtsen @ 1996-08-19 18:01 UTC (permalink / raw)


William Perry <wmperry@aventail.com> writes:

>   You might want to take a look at the parse structure of a page at
> dejanews or altavista - it might be possible to just recurse down the parse
> tree looking for <a> tags with a specific prefix for the HREF attribute...

That would probably be much better (not to mention more robust) than
the regexp-based approach I've used.  On the other hand, it's quite
important that this parsing is really fast, since nnweb needs to parse
quite a lot of them.  It seems like it might be overdoing it slightly
to parse things "properly" when all one needs are little snippets that
are inserted here and there...

-- 
  "Yes.  The journey through the human heart 
     would have to wait until some other time."



* Re: nndejanews -- http help needed
  1996-08-19 18:01       ` Lars Magne Ingebrigtsen
@ 1996-08-20  2:59         ` William Perry
  1996-08-22 15:41           ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 14+ messages in thread
From: William Perry @ 1996-08-20  2:59 UTC (permalink / raw)


Lars Magne Ingebrigtsen writes:
>William Perry <wmperry@aventail.com> writes:
>
>>   You might want to take a look at the parse structure of a page at
>> dejanews or altavista - it might be possible to just recurse down the parse
>> tree looking for <a> tags with a specific prefix for the HREF attribute...
>
>That would probably be much better (not to mention more robust) than the
>regexp-based approach I've used.  On the other hand, it's quite important
>that this parsing is really fast, since nnweb needs to parse quite a lot
>of them.  It seems like it might be overdoing it slightly to parse things
>"properly" when all one needs are little snippets that are inserted here
>and there...

  The parser itself is very fast - it's the calls into the drawing engine
that slow it down, and you can turn those off.

-Bill P.



* Re: nndejanews -- http help needed
  1996-08-20  2:59         ` William Perry
@ 1996-08-22 15:41           ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 14+ messages in thread
From: Lars Magne Ingebrigtsen @ 1996-08-22 15:41 UTC (permalink / raw)


William Perry <wmperry@aventail.com> writes:

>   The parser itself is very fast - it's the calls into the drawing engine
> that slow it down, and you can turn those off.

Right.  I'll look into it.  It would be very nice not to have to chew
the output "manually" with regexps...

-- 
  "Yes.  The journey through the human heart 
     would have to wait until some other time."



* Re: nndejanews -- http help needed
  1996-08-18 17:07     ` William Perry
  1996-08-19  8:40       ` Per Abrahamsen
  1996-08-19 18:01       ` Lars Magne Ingebrigtsen
@ 1996-08-23 14:05       ` Kai Grossjohann
  1996-08-23 15:50         ` C. R. Oldham
  2 siblings, 1 reply; 14+ messages in thread
From: Kai Grossjohann @ 1996-08-23 14:05 UTC (permalink / raw)
  Cc: Lars Magne Ingebrigtsen, ding

>>>>> William Perry writes:

  Bill>   I've been considering officially changing the name of W3 to
  Bill> GNUSCAPE or NEWSCAPE, but figured I'd get smacked by someone.

Who wants to have these connotations? :-)

GNUSAIC, I say :-)

kai
-- 
Life is hard and then you die.



* Re: nndejanews -- http help needed
  1996-08-23 14:05       ` Kai Grossjohann
@ 1996-08-23 15:50         ` C. R. Oldham
  0 siblings, 0 replies; 14+ messages in thread
From: C. R. Oldham @ 1996-08-23 15:50 UTC (permalink / raw)


On 23 Aug 1996, Kai Grossjohann wrote:

> Who wants to have these connotations? :-)
> 
> GNUSAIC, I say :-)


Or, we could add yet another homonym to the GNU dictionary and call it

	Gnuw

--
| Charles R. (C. R.) Oldham | North Central Association             |
| cro@nca.asu.edu           | Commission on Schools                 |
| cro@asu.edu               | Arizona State University, Box 873011, |
| Voice: 602/965-8700       | Tempe, AZ 85287-3011                _ |
| Fax:   602/965-9423       | #include "disclaimer.h"            X_>|

