From: David Engster <deng@randomsample.de>
To: ding@gnus.org
Subject: Re: Does nnweb with Google work any more?
Date: Wed, 14 Mar 2012 18:06:00 +0100
Message-ID: <87ipi7w1s7.fsf@randomsample.de>
In-Reply-To: <m37gynryl9.fsf@stories.gnus.org> (Lars Magne Ingebrigtsen's message of "Wed, 14 Mar 2012 16:28:34 +0100")
Lars Magne Ingebrigtsen writes:
> David Engster <deng@randomsample.de> writes:
>
>> No, they just don't want to be crawled. A simple "-A foobar" will make
>> it work. Also, adding "&output=gplain" will give raw text.
>
> Oh, nice. :-)
>
> curl -A foobar 'http://groups.google.com/group/rec.arts.sf.written/msg/eeb018dcf3c1688e?dmode=source&output=gplain'
>
> works fine.
>
> Then the only question is how to get from the Message-ID to the Google
> ID. Let's see... the first URL had this snippet in the HTML:
>
> Michael Stemper wrote: In article<9rt27vF38...@mid.individual.net>, <b>...</b></span><br><span class="a">http://groups.google.com/g/0897fef7/t/d00e330e9c82797a/d/eeb018dcf3c1688e</span>
>
> Will there only be one of these URLs in the output?
No idea. Maybe it would be safer to snarf the q=#eeb018... anchor from the
href of the title link:
<div class="g" align=left><a href="http://www.google.com/url?url=http://groups.google.com/g/0897fef7/t/d00e330e9c82797a/d/eeb018dcf3c1688e%3Fq%3D%23eeb018dcf3c1688e&ei=WM1gT-vtOdTw_Aah-IjTDg&sa=t&ct=res&cd=1&source=groups&usg=AFQjCNGv6gXQ4vTjK4dTlhZQfwmCOamYKw"
target="" dir=ltr>
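For illustration, here is a minimal Python sketch of that idea, not anything Gnus actually does. It assumes the title link always wraps the real target in a percent-encoded "url" query parameter and that the anchor after "#" is the hex ID, as in the sample above. The function names (extract_google_id, build_raw_request) are hypothetical; the raw-text URL layout and the "foobar" User-Agent are copied from the curl command quoted earlier. The request is only constructed, never sent.

```python
import re
from urllib.parse import parse_qs, urlsplit
from urllib.request import Request

# Title-link href copied from the search-result snippet above.
SAMPLE_HREF = (
    "http://www.google.com/url?url=http://groups.google.com/g/0897fef7/t/"
    "d00e330e9c82797a/d/eeb018dcf3c1688e%3Fq%3D%23eeb018dcf3c1688e"
    "&ei=WM1gT-vtOdTw_Aah-IjTDg&sa=t&ct=res&cd=1&source=groups"
    "&usg=AFQjCNGv6gXQ4vTjK4dTlhZQfwmCOamYKw"
)


def extract_google_id(href):
    """Return the hex ID found in the '#<id>' anchor of the wrapped URL, or None."""
    # parse_qs percent-decodes values, so %3Fq%3D%23<id> becomes ?q=#<id>.
    targets = parse_qs(urlsplit(href).query).get("url")
    if not targets:
        return None
    # After decoding, the ID is the fragment of the wrapped Groups URL.
    fragment = urlsplit(targets[0]).fragment
    # Assumption: IDs are lowercase hex strings like eeb018dcf3c1688e.
    if re.fullmatch(r"[0-9a-f]+", fragment):
        return fragment
    return None


def build_raw_request(group, google_id, user_agent="foobar"):
    """Build (but do not send) a request for the raw article text."""
    url = (
        f"http://groups.google.com/group/{group}/msg/{google_id}"
        "?dmode=source&output=gplain"
    )
    # Equivalent of curl's -A option: override the default User-Agent.
    return Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    gid = extract_google_id(SAMPLE_HREF)
    print(gid)
    if gid is not None:
        print(build_raw_request("rec.arts.sf.written", gid).full_url)
```

Taking the ID from the anchor rather than from the last path component means the result does not depend on how many /g/.../t/.../d/... URLs appear elsewhere in the page.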
Anyway, it's a pretty boring scavenger hunt.
-David