Gnus development mailing list
From: David Engster <deng@randomsample.de>
To: ding@gnus.org
Subject: Re: Does nnweb with Google work any more?
Date: Wed, 14 Mar 2012 18:06:00 +0100
Message-ID: <87ipi7w1s7.fsf@randomsample.de>
In-Reply-To: <m37gynryl9.fsf@stories.gnus.org> (Lars Magne Ingebrigtsen's message of "Wed, 14 Mar 2012 16:28:34 +0100")

Lars Magne Ingebrigtsen writes:
> David Engster <deng@randomsample.de> writes:
>
>> No, they just don't want to be crawled. A simple "-A foobar" will make
>> it work. Also, adding "&output=gplain" will give raw text.
>
> Oh, nice.  :-)
>
> curl -A foobar 'http://groups.google.com/group/rec.arts.sf.written/msg/eeb018dcf3c1688e?dmode=source&output=gplain'
>
> works fine.
>
> Then the only question is how to get from the Message-ID to the Google
> ID.  Let's see...  the first URL had this snippet in the HTML:
>
> Michael Stemper wrote: In article&lt;9rt27vF38...@mid.individual.net&gt;, <b>...</b></span><br><span class="a">http://groups.google.com/g/0897fef7/t/d00e330e9c82797a/d/eeb018dcf3c1688e</span>
>
> Will there only be one of these URLs in the output?

No idea. Maybe it would be safer to snarf the q=#eeb018... anchor from the
title's target:

<div class="g" align=left><a href="http://www.google.com/url?url=http://groups.google.com/g/0897fef7/t/d00e330e9c82797a/d/eeb018dcf3c1688e%3Fq%3D%23eeb018dcf3c1688e&amp;ei=WM1gT-vtOdTw_Aah-IjTDg&amp;sa=t&amp;ct=res&amp;cd=1&amp;source=groups&amp;usg=AFQjCNGv6gXQ4vTjK4dTlhZQfwmCOamYKw"
target="" dir=ltr>
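
For illustration, a rough Python sketch of that idea, assuming the
result link always wraps the Groups URL in a "url=" query parameter
and that the anchor after "#" is the Google ID, as in the snippet
above. The function name google_id_from_href is made up for this
example, and nothing here is verified beyond that single sample:

```python
from html import unescape
from urllib.parse import parse_qs, urlsplit

# The href attribute from the search result above, HTML entities included.
SAMPLE_HREF = (
    "http://www.google.com/url?url=http://groups.google.com/g/0897fef7"
    "/t/d00e330e9c82797a/d/eeb018dcf3c1688e%3Fq%3D%23eeb018dcf3c1688e"
    "&amp;ei=WM1gT-vtOdTw_Aah-IjTDg&amp;sa=t&amp;ct=res&amp;cd=1"
    "&amp;source=groups&amp;usg=AFQjCNGv6gXQ4vTjK4dTlhZQfwmCOamYKw"
)


def google_id_from_href(href):
    """Return the '#...' anchor of the wrapped Groups URL, or None.

    Hypothetical helper: assumes the redirect link carries the real
    target in its 'url' query parameter (true for the sample only).
    """
    # Undo the HTML escaping (&amp; -> &) before parsing the query string.
    outer = urlsplit(unescape(href))
    # parse_qs also percent-decodes, so %3Fq%3D%23... becomes ?q=#...
    targets = parse_qs(outer.query).get("url")
    if not targets:
        return None
    # The anchor of the inner URL is what we are after.
    return urlsplit(targets[0]).fragment or None


if __name__ == "__main__":
    print(google_id_from_href(SAMPLE_HREF))
```

Going through the "url=" parameter instead of matching the bare URL in
the page text sidesteps the question of how many such URLs show up in
the output, at least as long as the title link keeps this shape.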

Anyway, it's a pretty boring scavenger hunt.
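
Once the ID is known, the curl call quoted above could be mirrored
along these lines. This is only a sketch: raw_message_request is a
made-up name, the URL layout is copied from the one example, and the
request is built but not sent, so whether Google keeps answering it
this way is an open question:

```python
from urllib.parse import urlencode
from urllib.request import Request


def raw_message_request(group, google_id, user_agent="foobar"):
    """Build (but do not send) a request for the raw article text.

    Hypothetical helper mirroring:
      curl -A foobar '.../group/GROUP/msg/ID?dmode=source&output=gplain'
    """
    # Same two parameters as in the curl example.
    query = urlencode({"dmode": "source", "output": "gplain"})
    url = f"http://groups.google.com/group/{group}/msg/{google_id}?{query}"
    # Equivalent of curl's -A: override the default User-Agent header.
    return Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    req = raw_message_request("rec.arts.sf.written", "eeb018dcf3c1688e")
    print(req.full_url)
```

Passing the result to urllib.request.urlopen would perform the actual
fetch.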

-David


