Gnus development mailing list
* google as a news server, redux
@ 2001-12-29 23:28 Bill White
  2001-12-30  6:12 ` ShengHuo ZHU
  0 siblings, 1 reply; 12+ messages in thread
From: Bill White @ 2001-12-29 23:28 UTC


It seems that google now stores the original article information for
usenet postings.

<url:http://groups.google.com/groups?selm=lyiu0s1k73.fsf%40shasta.cs.uiuc.edu>
then click on "Original Format":
<url:http://groups.google.com/groups?selm=lyiu0s1k73.fsf%40shasta.cs.uiuc.edu&output=gplain>

Does this make it possible to use google as at least a read-only news
server?
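
The URL pattern above can be sketched as a small shell helper. This is only an illustration of the two URLs quoted here: the function name gplain_url is hypothetical, and it assumes that "@" is the only character in the Message-ID that needs escaping.

```shell
# Hypothetical helper (not part of Gnus or Google): build the
# "Original Format" URL for a Message-ID, following the URLs above.
# Assumption: only "@" is escaped; other reserved characters in a
# Message-ID would need percent-encoding as well.
gplain_url() {
    encoded=$(printf '%s' "$1" | sed 's/@/%40/g')
    printf 'http://groups.google.com/groups?selm=%s&output=gplain\n' "$encoded"
}

gplain_url 'lyiu0s1k73.fsf@shasta.cs.uiuc.edu'
```

Run as written, this prints the second URL quoted above.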

Cheers -

bw
-- 
Bill White . billw@wolfram.com . http://members.wri.com/billw
"No ma'am, we're musicians."




* Re: google as a news server, redux
  2001-12-29 23:28 google as a news server, redux Bill White
@ 2001-12-30  6:12 ` ShengHuo ZHU
  2001-12-31  5:21   ` Bill White
  0 siblings, 1 reply; 12+ messages in thread
From: ShengHuo ZHU @ 2001-12-30  6:12 UTC


Bill White <billw@wolfram.com> writes:

> It seems that google now stores the original article information for
> usenet postings.
>
> <url:http://groups.google.com/groups?selm=lyiu0s1k73.fsf%40shasta.cs.uiuc.edu>
> then click on "Original Format":
> <url:http://groups.google.com/groups?selm=lyiu0s1k73.fsf%40shasta.cs.uiuc.edu&output=gplain>
>
> Does this make it possible to use google as at least a read-only news
> server?

You might want to try `G w google RET group:gnu.emacs.gnus RET'.  But
because it is difficult to map article numbers to Message-IDs, it is
hardly possible to make such groups permanent.

ShengHuo




* Re: google as a news server, redux
  2001-12-30  6:12 ` ShengHuo ZHU
@ 2001-12-31  5:21   ` Bill White
  2002-01-01  7:03     ` ShengHuo ZHU
  0 siblings, 1 reply; 12+ messages in thread
From: Bill White @ 2001-12-31  5:21 UTC


On the Feast of the Holy Family, A. D. 2001, at 00:12, ShengHuo ZHU <zsh@cs.rochester.edu> said:

> Bill White <billw@wolfram.com> writes:
>
>> It seems that google now stores the original article information
>> for usenet postings.
>>
>> <url:http://groups.google.com/groups?selm=lyiu0s1k73.fsf%40shasta.cs.uiuc.edu>
>> then click on "Original Format":
>> <url:http://groups.google.com/groups?selm=lyiu0s1k73.fsf%40shasta.cs.uiuc.edu&output=gplain>
>>
>> Does this make it possible to use google as at least a read-only
>> news server?
>
> You might want to try `G w google RET group:gnu.emacs.gnus RET'.
> But because it is difficult to map article numbers to Message-IDs, it
> is hardly possible to make such groups permanent.

That command gives an error:

----------------------------------------------------------------------
Debugger entered--Lisp error: (error "Couldn't request group: No matching articles")
  signal(error ("Couldn't request group: No matching articles"))
  error("Couldn't request group: %s" "No matching articles")
  gnus-group-read-ephemeral-group("m3bsgg2ci0.fsf" (nnweb "m3bsgg2ci0.fsf" (nnweb-search "group:gnu.emacs.gnus") (nnweb-type google) (nnweb-ephemeral-p t)) t (#<buffer *Group*> . group))
  gnus-group-make-web-group(nil)
* call-interactively(gnus-group-make-web-group)
----------------------------------------------------------------------

I'm using the cvs version from about 10 minutes ago, with GNU Emacs
21.1.50.1 (i586-pc-linux-gnu, X toolkit) of 2001-12-23 on seton

Cheers -

bw
-- 
Bill White . billw@wolfram.com . http://members.wri.com/billw
"No ma'am, we're musicians."




* Re: google as a news server, redux
  2001-12-31  5:21   ` Bill White
@ 2002-01-01  7:03     ` ShengHuo ZHU
  2002-01-01 20:30       ` Ami Fischman
  0 siblings, 1 reply; 12+ messages in thread
From: ShengHuo ZHU @ 2002-01-01  7:03 UTC


Bill White <billw@wolfram.com> writes:

> On the Feast of the Holy Family, A. D. 2001, at 00:12, ShengHuo ZHU <zsh@cs.rochester.edu> said:
>
>> Bill White <billw@wolfram.com> writes:
>>
>>> It seems that google now stores the original article information
>>> for usenet postings.
>>>
>>> <url:http://groups.google.com/groups?selm=lyiu0s1k73.fsf%40shasta.cs.uiuc.edu>
>>> then click on "Original Format":
>>> <url:http://groups.google.com/groups?selm=lyiu0s1k73.fsf%40shasta.cs.uiuc.edu&output=gplain>
>>>
>>> Does this make it possible to use google as at least a read-only
>>> news server?
>>
>> You might want to try `G w google RET group:gnu.emacs.gnus RET'.
>> But because it is difficult to map article numbers to Message-IDs, it
>> is hardly possible to make such groups permanent.
>
> That command gives an error:
>
> ----------------------------------------------------------------------
> Debugger entered--Lisp error: (error "Couldn't request group: No matching articles")
>   signal(error ("Couldn't request group: No matching articles"))
>   error("Couldn't request group: %s" "No matching articles")
>   gnus-group-read-ephemeral-group("m3bsgg2ci0.fsf" (nnweb "m3bsgg2ci0.fsf" (nnweb-search "group:gnu.emacs.gnus") (nnweb-type google) (nnweb-ephemeral-p t)) t (#<buffer *Group*> . group))
>   gnus-group-make-web-group(nil)
> * call-interactively(gnus-group-make-web-group)
> ----------------------------------------------------------------------
>
> I'm using the cvs version from about 10 minutes ago, with GNU Emacs
> 21.1.50.1 (i586-pc-linux-gnu, X toolkit) of 2001-12-23 on seton

It works fine for me. I am not sure whether it is a url package
related problem or not. If you have wget, lynx or curl installed,
maybe you can try (setq mm-url-use-external t).

ShengHuo




* Re: google as a news server, redux
  2002-01-01  7:03     ` ShengHuo ZHU
@ 2002-01-01 20:30       ` Ami Fischman
  2002-01-02  0:19         ` ShengHuo ZHU
  2002-01-02  0:22         ` Lars Magne Ingebrigtsen
  0 siblings, 2 replies; 12+ messages in thread
From: Ami Fischman @ 2002-01-01 20:30 UTC


ShengHuo ZHU <zsh@cs.rochester.edu> writes:

[...]

> It works fine for me. I am not sure whether it is a url package
> related problem or not. If you have wget, lynx or curl installed,
> maybe you can try (setq mm-url-use-external t).

I'm having the same trouble as the OP (exact same error message).  Trapping
the call to wget (I already have mm-url-use-external set to t), I see that
the command line it is passed is:
wget -q -O - 'http://groups.google.com/groups?q=group%3Agnu.emacs.gnus&num=100&hq=&hl=&lr=&safe=off&sites=groups'
(quoting may be off b/c of the way I catch the args)
and google replies with:

<HTML><HEAD><TITLE>Forbidden</TITLE></HEAD>
<BODY><H1>Forbidden</H1>
Your client from the IP address xxx.xxx.xxx.xxx does not have permission to get
URL
/groups?q=group%3Agnu.emacs.gnus&amp;num=100&amp;hq=&amp;hl=&amp;lr=&amp;safe=off&amp;sites=groups
from this server.<BR><BR>Please see Google's Terms of Service posted at
http://www.google.com/terms_of_service.html
<BR><BR>If you believe that you have received this response in error, please
send email to <A
href="mailto:forbidden@google.com">forbidden@google.com</A>.  In your email,
please let us know the IP address from which you are querying (it is shown
above).  This will help us track down the problem.</BODY></HTML>

(linebreaking may be slightly off, this is an X copy/paste).

Google's TOS prohibits "offline searches", though they don't define what
that means.  But SZ's ability to use google probably means that google isn't
blocking this on purpose (?).  Any ideas what's causing this?  Could it be
that google blocks certain IP blocks for some reason?  (I can search google
via a webbrowser just fine from this IP).
-- 
  Ami Fischman
  usenet@fischman.org




* Re: google as a news server, redux
  2002-01-01 20:30       ` Ami Fischman
@ 2002-01-02  0:19         ` ShengHuo ZHU
  2002-01-02  0:53           ` Ami Fischman
                             ` (2 more replies)
  2002-01-02  0:22         ` Lars Magne Ingebrigtsen
  1 sibling, 3 replies; 12+ messages in thread
From: ShengHuo ZHU @ 2002-01-02  0:19 UTC


Ami Fischman <usenet@fischman.org> writes:

[...]

> Google's TOS prohibits "offline searches", though they don't define
> what that means.  But SZ's ability to use google probably means that
> google isn't blocking this on purpose (?).  Any ideas what's causing
> this?  Could it be that google blocks certain IP blocks for some
> reason?  (I can search google via a webbrowser just fine from this
> IP).

Emm, it seems that google checks useragent.  Accidentally and
intentionally, I have 'useragent= Mozilla/4.7 [en] (Linux; I)' in my
.wgetrc.
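
For a one-off request without editing .wgetrc, the same idea can be sketched with wget's --user-agent option. This is a minimal sketch, not what Gnus itself runs: fetch_as_browser is a hypothetical name, and the User-Agent string is simply the one from the .wgetrc line above.

```shell
# Hypothetical wrapper (not part of Gnus): fetch a page with a
# browser-like User-Agent and write it to standard output, using the
# same -q -O - options as the wget call trapped earlier in the thread.
fetch_as_browser() {
    wget -q -O - --user-agent='Mozilla/4.7 [en] (Linux; I)' "$1"
}
```

It would be called with the search URL quoted earlier, for example:
fetch_as_browser 'http://groups.google.com/groups?q=group%3Agnu.emacs.gnus&num=100&hq=&hl=&lr=&safe=off&sites=groups'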

ShengHuo




* Re: google as a news server, redux
  2002-01-01 20:30       ` Ami Fischman
  2002-01-02  0:19         ` ShengHuo ZHU
@ 2002-01-02  0:22         ` Lars Magne Ingebrigtsen
  1 sibling, 0 replies; 12+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-02  0:22 UTC


Ami Fischman <usenet@fischman.org> writes:

> Google's TOS prohibits "offline searches", though they don't define what
> that means. 

I'm guessing wildly, and I'm guessing that this means that the
"Referer" header has to point to a Google page.  

-- 
(domestic pets only, the antidote for overdose, milk.)
   larsi@gnus.org * Lars Magne Ingebrigtsen




* Re: google as a news server, redux
  2002-01-02  0:19         ` ShengHuo ZHU
@ 2002-01-02  0:53           ` Ami Fischman
  2002-01-02  1:07           ` Bill White
  2002-01-02 14:51           ` Randal L. Schwartz
  2 siblings, 0 replies; 12+ messages in thread
From: Ami Fischman @ 2002-01-02  0:53 UTC


ShengHuo ZHU <zsh@cs.rochester.edu> writes:

> Emm, it seems that google checks useragent.  Accidentally and
> intentionally, I have 'useragent= Mozilla/4.7 [en] (Linux; I)' in my
> .wgetrc.

Indeed, that fixes it for me too.

-- 
  Ami Fischman
  usenet@fischman.org




* Re: google as a news server, redux
  2002-01-02  0:19         ` ShengHuo ZHU
  2002-01-02  0:53           ` Ami Fischman
@ 2002-01-02  1:07           ` Bill White
  2002-01-02  1:43             ` Ian Jones
  2002-01-02 14:51           ` Randal L. Schwartz
  2 siblings, 1 reply; 12+ messages in thread
From: Bill White @ 2002-01-02  1:07 UTC


On Tue Jan 01 2002 at 18:19, ShengHuo ZHU <zsh@cs.rochester.edu> said:

> Ami Fischman <usenet@fischman.org> writes:
>
> [...]
>
>> Google's TOS prohibits "offline searches", though they don't define
>> what that means.  But SZ's ability to use google probably means
>> that google isn't blocking this on purpose (?).  Any ideas what's
>> causing this?  Could it be that google blocks certain IP blocks for
>> some reason?  (I can search google via a webbrowser just fine from
>> this IP).
>
> Emm, it seems that google checks useragent.  Accidentally and
> intentionally, I have 'useragent= Mozilla/4.7 [en] (Linux; I)' in my
> .wgetrc.

BINGO!  Three cheers for ShengHuo ZHU!

Cheers -

bw
-- 
Bill White . billw@wolfram.com . http://members.wri.com/billw
"No ma'am, we're musicians."




* Re: google as a news server, redux
  2002-01-02  1:07           ` Bill White
@ 2002-01-02  1:43             ` Ian Jones
  2002-01-02  2:23               ` ShengHuo ZHU
  0 siblings, 1 reply; 12+ messages in thread
From: Ian Jones @ 2002-01-02  1:43 UTC


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bill White <billw@wolfram.com> writes:

>>> Google's TOS prohibits "offline searches", though they don't define
>>> what that means.  But SZ's ability to use google probably means
>>> that google isn't blocking this on purpose (?).  Any ideas what's
>>> causing this?  Could it be that google blocks certain IP blocks for
>>> some reason?  (I can search google via a webbrowser just fine from
>>> this IP).
>>
>> Emm, it seems that google checks useragent.  Accidentally and
>> intentionally, I have 'useragent= Mozilla/4.7 [en] (Linux; I)' in my
>> .wgetrc.
>
> BINGO!  Three cheers for ShengHuo ZHU!

Could you help a seemingly dense lurker get "G w" to work for google? I
take it there is a variable to set the user-agent string, but I can't
find it.

Or is the suggestion to use an external package like wget the fix?

-----BEGIN PGP SIGNATURE-----
Comment: Keeping the world safe for geeks.

iD8DBQE8MmXEwBVKl/Nci0oRAn3LAKC30lxf233RUXm6rbHojv29X9Yf7gCeMzOH
3E5VFrJOjy1bMZH6sHTf/8w=
=E9iQ
-----END PGP SIGNATURE-----




* Re: google as a news server, redux
  2002-01-02  1:43             ` Ian Jones
@ 2002-01-02  2:23               ` ShengHuo ZHU
  0 siblings, 0 replies; 12+ messages in thread
From: ShengHuo ZHU @ 2002-01-02  2:23 UTC


Ian Jones <ian@dsl081-056-052.sfo1.dsl.speakeasy.net> writes:

[...]

> Could you help a seemingly dense lurker get "G w" to work for google? I
> take it there is a variable to set the user-agent string, but I can't
> find it.

(setq url-package-name "Mozilla" url-package-version "4.0") sets the
user-agent string to "Mozilla/4.0".  Maybe the hack should be set by
default for some sites.

> Or is the suggestion to use an external package like wget the fix?

I use wget because external programs are faster and more reliable at
pulling pages, especially slashdot pages.

ShengHuo




* Re: google as a news server, redux
  2002-01-02  0:19         ` ShengHuo ZHU
  2002-01-02  0:53           ` Ami Fischman
  2002-01-02  1:07           ` Bill White
@ 2002-01-02 14:51           ` Randal L. Schwartz
  2 siblings, 0 replies; 12+ messages in thread
From: Randal L. Schwartz @ 2002-01-02 14:51 UTC


>>>>> "ZSH" == ShengHuo ZHU <zsh@cs.rochester.edu> writes:

ZSH> Emm, it seems that google checks useragent.  Accidentally and
ZSH> intentionally, I have 'useragent= Mozilla/4.7 [en] (Linux; I)' in my
ZSH> .wgetrc.

"Mozilla/4" suffices.  In our case, the platform is "elisp". :)

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!




end of thread

Thread overview: 12+ messages
2001-12-29 23:28 google as a news server, redux Bill White
2001-12-30  6:12 ` ShengHuo ZHU
2001-12-31  5:21   ` Bill White
2002-01-01  7:03     ` ShengHuo ZHU
2002-01-01 20:30       ` Ami Fischman
2002-01-02  0:19         ` ShengHuo ZHU
2002-01-02  0:53           ` Ami Fischman
2002-01-02  1:07           ` Bill White
2002-01-02  1:43             ` Ian Jones
2002-01-02  2:23               ` ShengHuo ZHU
2002-01-02 14:51           ` Randal L. Schwartz
2002-01-02  0:22         ` Lars Magne Ingebrigtsen
