Gnus development mailing list
* google search -> 403 forbidden
@ 2002-08-21  9:39 Bill White
  2002-08-21 11:28 ` Kai Großjohann
  2002-08-23  3:55 ` Jesper Harder
  0 siblings, 2 replies; 7+ messages in thread
From: Bill White @ 2002-08-21  9:39 UTC


When doing a Google search in the Group buffer via 'G w', I get an
error saying that the search produced no results.  After some
edebug-defun work, I found that 'G w' was building a URL that Google
doesn't like.

My search string was "mark shea" (with the quotes), and here's the
resulting message buffer:

,----
| Edebug: nnweb-google-search
| Opening nnweb server on ruoofbwmukd.fsf-ephemeral...done
|  [7 times]
| Result: "http://groups.google.com/groups"
|  [6 times]
| Result: (("q" . "\"mark shea\"") ("num" . "100") ("hq" . "") ("hl" . "") ("lr" . "") ("safe" . "off") ("sites" . "groups"))
|  [2 times]
| Result: "q=%22mark+shea%22&num=100&hq=&hl=&lr=&safe=off&sites=groups"
|  [2 times]
| Result: "http://groups.google.com/groups?q=%22mark+shea%22&num=100&hq=&hl=&lr=&safe=off&sites=groups"
|  [2 times]
| Result: 1 = C-a
| 
| No matching articles
| Entering debugger...
|  [2 times]
| Back to top level.
`----

In my web browser (Opera 6.02 for Linux) I followed the URL generated
by nnweb-google-search and got Google's "403 Forbidden" page.  I
then tried some simple-minded futzing with nnweb-google-search to no
avail.  Can someone else make this work?
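
For reference, the query string in the trace can be reproduced with a
few lines of Emacs Lisp.  This is only a sketch, not nnweb's actual
code: `my/encode-query' is a hypothetical helper, and it assumes that
`url-hexify-string' from url-util.el is available.

```elisp
(require 'url-util)  ; provides `url-hexify-string'

;; Hypothetical helper for illustration; nnweb has its own encoder.
(defun my/encode-query (pairs)
  "Encode PAIRS, an alist of (NAME . VALUE) strings, as a URL query string."
  (mapconcat
   (lambda (pair)
     (concat (url-hexify-string (car pair))
             "="
             ;; `url-hexify-string' encodes a space as %20, while the
             ;; trace shows the form-style "+", so convert afterwards.
             (replace-regexp-in-string
              "%20" "+" (url-hexify-string (cdr pair)) t t)))
   pairs
   "&"))
```

Evaluating (my/encode-query '(("q" . "\"mark shea\"") ("num" . "100")))
gives "q=%22mark+shea%22&num=100", which matches the start of the
query string in the trace.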

Odd, disturbing detail: this all works fine from my Linux box at home
through my dialup connection, but not on my Linux box at work through
our dedicated connection.  I use CVS Emacs in both places, checked out
within a day or so of each other, and I use the *same* Gnus .el and
.elc files, rsync'd back & forth between the two computers.

Oort Gnus v0.08 (cvs from 9am UTC 21 Aug 2002), GNU Emacs 21.3.50.1
(i586-pc-linux-gnu, X toolkit) of 2002-07-15 on billwlx

Cheers -

bw
-- 
Bill White . billw@wolfram.com . http://members.wri.com/billw
"No ma'am, we're musicians."





* Re: google search -> 403 forbidden
  2002-08-21  9:39 google search -> 403 forbidden Bill White
@ 2002-08-21 11:28 ` Kai Großjohann
  2002-08-21 11:56   ` Bill White
  2002-08-26 13:16   ` Bill White
  2002-08-23  3:55 ` Jesper Harder
  1 sibling, 2 replies; 7+ messages in thread
From: Kai Großjohann @ 2002-08-21 11:28 UTC
  Cc: ding

I think that Google looks at the User-Agent header.  Not sure whether
that's relevant.
kai
-- 
A large number of young women don't trust men with beards.  (BFBS Radio)




* Re: google search -> 403 forbidden
  2002-08-21 11:28 ` Kai Großjohann
@ 2002-08-21 11:56   ` Bill White
  2002-08-21 14:31     ` Kai Großjohann
  2002-08-26 13:16   ` Bill White
  1 sibling, 1 reply; 7+ messages in thread
From: Bill White @ 2002-08-21 11:56 UTC


On Wed Aug 21 2002 at 06:28, Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) said:

> I think that Google looks at the User-Agent header.  Not sure whether
> that's relevant.

Per ShengHuo's 30 Jan 02 advice in
<2nvgdjmc4d.fsf@zsh.cs.rochester.edu> I have this in my .gnus:

   (setq url-package-name "Mozilla" url-package-version "4")

Does that frob google's notion of User-Agent?

Cheers -

bw
-- 
Bill White . billw@wolfram.com . http://members.wri.com/billw
"No ma'am, we're musicians."





* Re: google search -> 403 forbidden
  2002-08-21 11:56   ` Bill White
@ 2002-08-21 14:31     ` Kai Großjohann
  0 siblings, 0 replies; 7+ messages in thread
From: Kai Großjohann @ 2002-08-21 14:31 UTC
  Cc: ding

Bill White <billw@wolfram.com> writes:

>    (setq url-package-name "Mozilla" url-package-version "4")
>
> Does that frob google's notion of User-Agent?

I guess so.

kai
-- 
A large number of young women don't trust men with beards.  (BFBS Radio)




* Re: google search -> 403 forbidden
  2002-08-21  9:39 google search -> 403 forbidden Bill White
  2002-08-21 11:28 ` Kai Großjohann
@ 2002-08-23  3:55 ` Jesper Harder
  1 sibling, 0 replies; 7+ messages in thread
From: Jesper Harder @ 2002-08-23  3:55 UTC


Bill White <billw@wolfram.com> writes:

> When doing a google search in the Group buffer via 'G w', I get an
> error saying that the search produced no results.  After some
> edebug-defun work, I found that 'G w' was building a url that google
> doesn't like.
>
> Can someone else make this work?

It works for me.

> Odd, disturbing detail: this all works fine from my linux box at home
> through my dialup connection, but not on my linux box at work through
> our dedicated connection.

Do you use the same version of the url package? (just a guess).





* Re: google search -> 403 forbidden
  2002-08-21 11:28 ` Kai Großjohann
  2002-08-21 11:56   ` Bill White
@ 2002-08-26 13:16   ` Bill White
  2002-08-27 13:58     ` Bill White
  1 sibling, 1 reply; 7+ messages in thread
From: Bill White @ 2002-08-26 13:16 UTC


On Wed Aug 21 2002 at 06:28, Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) said:

> I think that Google looks at the User-Agent header.  Not sure
> whether that's relevant.  kai

That was the problem.  On my office Linux box, where 'G w' didn't
work, `mm-url-use-external' is t, so `mm-url-program' is used to
fetch URLs.  Here at work, `mm-url-program' is wget.

It seems that Google rejects wget's default User-Agent, so I've told
wget to advertise itself as Mozilla:

(setq mm-url-predefined-programs
      '((wget "wget" "-U" "Mozilla" "-q" "-O" "-")
	(w3m  "w3m" "-dump_source")
	(lynx "lynx" "-source")
	(curl "curl")))

and all is well.
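
To see which fetch path a given machine is using (for instance, to
compare the home and office setups), something like the following can
be evaluated in *scratch*.  It is a diagnostic sketch only, and it
assumes mm-url.el is on the load path and defines the three variables
named above.

```elisp
(require 'mm-url)

;; Report whether Gnus fetches URLs through an external program and,
;; if so, which command line it would run.
(message "use-external=%S program=%S command=%S"
         mm-url-use-external
         mm-url-program
         ;; `mm-url-program' may also be a plain string, in which case
         ;; this lookup simply returns nil.
         (cdr (assq mm-url-program mm-url-predefined-programs)))
```

If use-external is nil on one machine and t on the other, the two are
not fetching URLs the same way even with identical Gnus files.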

Cheers -

bw
-- 
Bill White . billw@wolfram.com . http://members.wri.com/billw
"No ma'am, we're musicians."





* Re: google search -> 403 forbidden
  2002-08-26 13:16   ` Bill White
@ 2002-08-27 13:58     ` Bill White
  0 siblings, 0 replies; 7+ messages in thread
From: Bill White @ 2002-08-27 13:58 UTC


On Mon Aug 26 2002 at 08:16, Bill White <billw@wolfram.com> said:

> (setq mm-url-predefined-programs
>       '((wget "wget" "-U" "Mozilla" "-q" "-O" "-")
> 	(w3m  "w3m" "-dump_source")
> 	(lynx "lynx" "-source")
> 	(curl "curl")))

And the general solution, which covers every wget invocation and not
just the one Gnus makes, is:

,----[ ~/.wgetrc ]
| useragent = "Mozilla/4.0"
`----

Cheers -

bw
-- 
Bill White . billw@wolfram.com . http://members.wri.com/billw
"No ma'am, we're musicians."





