Gnus development mailing list
* "nnweb survey: Please reply if you are using it"
@ 2025-02-11 23:28 Andrew Cohen
  2025-02-12  7:12 ` James Thomas
  2025-02-12  7:36 ` Bjørn Mork
  0 siblings, 2 replies; 6+ messages in thread
From: Andrew Cohen @ 2025-02-11 23:28 UTC
  To: ding

Using nnweb?
-- 
Andrew Cohen




* Re: "nnweb survey: Please reply if you are using it"
  2025-02-11 23:28 "nnweb survey: Please reply if you are using it" Andrew Cohen
@ 2025-02-12  7:12 ` James Thomas
  2025-02-14 11:13   ` Björn Bidar
  2025-02-12  7:36 ` Bjørn Mork
  1 sibling, 1 reply; 6+ messages in thread
From: James Thomas @ 2025-02-12  7:12 UTC
  To: Andrew Cohen; +Cc: ding

Andrew Cohen <acohen@ust.hk> writes:

> Using nnweb?

Yes, but not with the defaults (which no longer work due to server-side
changes); I wrote an extension which I use daily:

  https://codeberg.org/quotuva/nnweb-page/src/branch/main/nnweb-page.el

--



* Re: "nnweb survey: Please reply if you are using it"
  2025-02-11 23:28 "nnweb survey: Please reply if you are using it" Andrew Cohen
  2025-02-12  7:12 ` James Thomas
@ 2025-02-12  7:36 ` Bjørn Mork
  1 sibling, 0 replies; 6+ messages in thread
From: Bjørn Mork @ 2025-02-12  7:36 UTC
  To: ding

Looking through my ~/.gnus I found this, which does look like something
which might be useful:

 ; fall back to google if the 'current select method doesn't have this article
 (setq gnus-refer-article-method '(current
 				  (nnweb "google" (nnweb-type google))))


But testing right now it doesn't seem to work:

 Opening nnweb server on google...done
 Contacting host: www.google.com:80
 url-cookie-generate-header-lines: Wrong type argument: url-cookie, [cookie "PREF" "ID=91404ab4876243f7:TM=1173349445:LM=1234614879:S=wleZIuWR8y-I7D9R" "14-Feb-2011 12:34:39.00 GMT" "/" ".google.com" nil]


And I haven't actually noticed/cared, so I'm not going to claim that I
am using it.


Bjørn



* Re: "nnweb survey: Please reply if you are using it"
  2025-02-12  7:12 ` James Thomas
@ 2025-02-14 11:13   ` Björn Bidar
  2025-02-20  9:12     ` James Thomas
  0 siblings, 1 reply; 6+ messages in thread
From: Björn Bidar @ 2025-02-14 11:13 UTC
  To: James Thomas; +Cc: Andrew Cohen, ding

James Thomas <jimjoe@gmx.net> writes:

> Andrew Cohen <acohen@ust.hk> writes:
>
>> Using nnweb?
>
> Yes, but not with the defaults (which no longer work due to server-side
> changes); I wrote an extension which I use daily:
>
>   https://codeberg.org/quotuva/nnweb-page/src/branch/main/nnweb-page.el

That looks useful. I was wondering myself whether something like this
could be adapted for Mailman archives or the MHonArc archives that GNU
uses.

Some glue for public-inbox archives over IMAP or NNTP could also help,
but that is unrelated to nnweb, since those archives can already be read
over IMAP or NNTP.



* Re: "nnweb survey: Please reply if you are using it"
  2025-02-14 11:13   ` Björn Bidar
@ 2025-02-20  9:12     ` James Thomas
  2025-02-20 11:30       ` Björn Bidar
  0 siblings, 1 reply; 6+ messages in thread
From: James Thomas @ 2025-02-20  9:12 UTC
  To: Andrew Cohen; +Cc: ding


Björn Bidar writes:

> James Thomas writes:
>
>> Andrew Cohen writes:
>>
>>> Using nnweb?
>>
>> Yes, but not with the defaults (which no longer work due to server-side
>> changes); I wrote an extension which I use daily:
>>
>>   https://codeberg.org/quotuva/nnweb-page/src/branch/main/nnweb-page.el
>
> That looks useful. I was wondering myself whether something like this
> could be adapted for Mailman archives or the MHonArc archives that GNU
> uses.

So here's a patch that changes nnweb defaults for help-gnu-emacs:


[-- Attachment #2: Update nnweb for lists.gnu.org --]
[-- Type: text/x-patch, Size: 6506 bytes --]

diff --git a/lisp/gnus/nnweb.el b/lisp/gnus/nnweb.el
index 57964f93437..02379a44600 100644
--- a/lisp/gnus/nnweb.el
+++ b/lisp/gnus/nnweb.el
@@ -40,20 +40,20 @@
 (defvoo nnweb-directory (nnheader-concat gnus-directory "nnweb/")
   "Where nnweb will save its files.")

-(defvoo nnweb-type 'google
+(defvoo nnweb-type 'emacs-help
   "What search engine type is being used.
-Valid types include `google' and `dejanews'.")
+Valid types include `emacs-help' and `dejanews'.")

 (defvar nnweb-type-definition
-  '((google
-     (id . "https://www.google.com/groups?as_umsgid=%s&hl=en&dmode=source")
-     (result . "https://groups.google.com/group/%s/msg/%s?dmode=source")
-     (article . nnweb-google-wash-article)
-     (reference . identity)
+  '((emacs-help
+     (id . "https://lists.gnu.org/archive/cgi-bin/namazu.cgi?idxname=help-gnu-emacs&sort=score&result=normal&max=20&submit=Search!&query=%%2Bmessage-id%%3A%%3C%s%%3E")
+     (result . "https://lists.gnu.org/archive/html/%s/%s.html")
+     (article . nnweb-gmane-wash-article)
+     (reference . nnweb-google-reference)
      (map . nnweb-google-create-mapping)
      (search . nnweb-google-search)
-     (address . "https://groups.google.com/groups")
-     (base    . "https://groups.google.com")
+     (address . "https://lists.gnu.org/archive/cgi-bin/namazu.cgi")
+     (base    . "https://lists.gnu.org")
      (identifier . nnweb-google-identity))
     ;; FIXME: Make obsolete?
     (dejanews ;; alias of google
@@ -315,7 +315,7 @@ nnweb-google-parse-1
 	(case-fold-search t)
 	(active (cadr (assoc nnweb-group nnweb-group-alist)))
 	Subject Date Newsgroups From
-	map url mid)
+	map url mid link)
     (unless active
       (push (list nnweb-group (setq active (cons 1 0)))
 	    nnweb-group-alist))
@@ -323,39 +323,34 @@ nnweb-google-parse-1
     (goto-char (point-min))
     (while
 	(re-search-forward
-	 "a +href=\"/group/\\([^>\"]+\\)/browse_thread/[^>]+#\\([0-9a-f]+\\)"
+	 "<a +href=\"/archive/html/\\([^/]+\\)/\\([0-9-]+/msg[0-9]+\\)\\.html\">"
 	 nil t)
-      (setq Newsgroups (match-string-no-properties 1)
-	    ;; Note: Starting with Google Groups 2, `mid' is a Google-internal
-	    ;; ID, not a proper Message-ID.
+      (setq link (match-string-no-properties 0)
+            Newsgroups (match-string-no-properties 1)
+            ;; `mid' is not a proper Message-ID.
 	    mid (match-string-no-properties 2)
 	    url (format
 		 (nnweb-definition 'result) Newsgroups mid))
-      (narrow-to-region (search-forward ">" nil t)
+      (narrow-to-region (point)
 			(search-forward "</a>" nil t))
       (mm-url-remove-markup)
       (mm-url-decode-entities)
+      (while (search-forward "\n" nil t) (replace-match "" t t))
       (setq Subject (buffer-string))
       (goto-char (point-max))
       (widen)
       (narrow-to-region (point)
-			(search-forward "</table" nil t))
+			(search-forward link nil t))

       (mm-url-remove-markup)
       (mm-url-decode-entities)
       (goto-char (point-max))
       (when
 	  (re-search-backward
- 	   "^\\(?:\\(\\w+\\) \\([0-9]+\\)\\|\\S-+\\)\\(?: \\([0-9]\\{4\\}\\)\\)? by ?\\(.*\\)"
+ 	   "^Author: \\(.*\\)\nDate: \\(.*\\)"
 	   nil t)
-	(setq Date (if (match-string 1)
-		       (format "%s %s 00:00:00 %s"
-			       (match-string 1)
-			       (match-string 2)
-			       (or (match-string 3)
-				   (format-time-string "%Y")))
-		     (current-time-string)))
-	(setq From (match-string 4)))
+	(setq Date (match-string-no-properties 2))
+	(setq From (match-string-no-properties 1)))
       (widen)
       (cl-incf i)
       (unless (nnweb-get-hashtb url)
@@ -363,10 +358,8 @@ nnweb-google-parse-1
 	 (list
 	  (cl-incf (cdr active))
 	  (make-full-mail-header
-	   (cdr active) (if Newsgroups
-			    (concat  "(" Newsgroups ") " Subject)
-			  Subject)
-	   From Date (or Message-ID mid)
+	   (cdr active) Subject
+	   From Date (or Message-ID (concat Newsgroups "/" mid))
 	   nil 0 0 url))
 	 map)
 	(nnweb-set-hashtb (cadar map) (car map))))
@@ -384,10 +377,11 @@ nnweb-google-create-mapping
   "Perform the search and create a number-to-url alist."
   (with-current-buffer nnweb-buffer
     (erase-buffer)
-    (nnheader-message 7 "Searching google...")
+    (nnheader-message 7 "Searching...")
     (when (funcall (nnweb-definition 'search) nnweb-search)
-	(let ((more t)
-	      (i 0))
+	(let ((more 0)
+	      (i 0)
+              link)
 	  (while more
 	    (setq nnweb-articles
 		  (nconc nnweb-articles (nnweb-google-parse-1)))
@@ -395,18 +389,19 @@ nnweb-google-create-mapping
 	    (goto-char (point-min))
 	    (cl-incf i 100)
 	    (if (or (not (re-search-forward
-			  "<a [^>]+href=\"\n?\\([^>\" \n\t]+\\)[^<]*<img[^>]+src=[^>]+next"
+                          (format "<a +href=\"\\(/archive/cgi-bin/namazu\\.cgi\\?[^>]+&amp;whence=%s\\)\"" (+ more 20))
 			  nil t))
 		    (>= i nnweb-max-hits))
 		(setq more nil)
 	      ;; Yup, there are more articles
-	      (setq more (concat (nnweb-definition 'base) (match-string 1)))
+	      (setq more (+ more 20)
+                    link (mm-url-decode-entities-string (match-string 1)))
 	    (when more
 	      (erase-buffer)
-	      (nnheader-message 7 "Searching google...(%d)" i)
-	      (mm-url-insert more))))
+	      (nnheader-message 7 "Searching...(%d)" i)
+	      (mm-url-insert (concat (nnweb-definition 'base) link)))))
 	  ;; Return the articles in the right order.
-	  (nnheader-message 7 "Searching google...done")
+	  (nnheader-message 7 "Searching...done")
 	  (setq nnweb-articles
 		(sort nnweb-articles #'car-less-than-car))))))

@@ -416,20 +411,18 @@ nnweb-google-search
     (nnweb-definition 'address)
     "?"
     (mm-url-encode-www-form-urlencoded
-     `(("q" . ,search)
-       ("num" . ,(number-to-string
-		  (min 100 nnweb-max-hits)))
-       ("hq" . "")
-       ("hl" . "en")
-       ("lr" . "")
-       ("safe" . "off")
-       ("sites" . "groups")
-       ("filter" . "0")))))
+     `(("idxname" . "help-gnu-emacs")
+       ("sort" . "score")
+       ("result" . "normal")
+       ("max" . ,(number-to-string
+		  (min 20 nnweb-max-hits)))
+       ("submit" . "Search!")
+       ("query" . ,search)))))
   t)

 (defun nnweb-google-identity (url)
   "Return a unique identifier based on URL."
-  (if (string-match "selm=\\([^ &>]+\\)" url)
+  (if (string-match "archive/html/\\(.*\\)\\.html" url)
       (match-string 1 url)
     url))




What do you think? If you can, please test it with:
G w emacs-help RET unique string RET.

(I have not worked on the article-washing part.)
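
A minimal sketch of how the `id' URL template in the patch expands for a
given Message-ID, in case it helps with testing outside Gnus. The helper
and constant names and the sample Message-ID are hypothetical, and
encoding the Message-ID with `url-hexify-string' is an assumption about
what the search form expects, not necessarily what nnweb itself does:

```elisp
(require 'url-util)

;; The `id' template from the patch.  "%%" is a literal "%" for `format',
;; so the query decodes to +message-id:<...> on the server side.
(defconst my-nnweb-emacs-help-id-template
  "https://lists.gnu.org/archive/cgi-bin/namazu.cgi?idxname=help-gnu-emacs&sort=score&result=normal&max=20&submit=Search!&query=%%2Bmessage-id%%3A%%3C%s%%3E"
  "Hypothetical copy of the `id' entry for the `emacs-help' type.")

(defun my-nnweb-emacs-help-id-url (message-id)
  "Return the search URL for MESSAGE-ID, given without angle brackets."
  ;; Assumption: percent-encode the Message-ID before substituting it.
  (format my-nnweb-emacs-help-id-template
          (url-hexify-string message-id)))

;; Example call with a made-up Message-ID:
(my-nnweb-emacs-help-id-url "87abc@example.org")
```

Opening the resulting URL in a browser should show whether the Namazu
index finds the message at all, independently of the parsing code.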

--


* Re: "nnweb survey: Please reply if you are using it"
  2025-02-20  9:12     ` James Thomas
@ 2025-02-20 11:30       ` Björn Bidar
  0 siblings, 0 replies; 6+ messages in thread
From: Björn Bidar @ 2025-02-20 11:30 UTC
  To: James Thomas; +Cc: Andrew Cohen, ding

James Thomas <jimjoe@gmx.net> writes:

> Björn Bidar writes:
>
>> James Thomas writes:
>>
>>> Andrew Cohen writes:
>>>
>>>> Using nnweb?
>>>
>>> Yes, but not with the defaults (which no longer work due to server-side
>>> changes); I wrote an extension which I use daily:
>>>
>>>   https://codeberg.org/quotuva/nnweb-page/src/branch/main/nnweb-page.el
>>
>> That looks useful. I was wondering myself whether something like this
>> could be adapted for Mailman archives or the MHonArc archives that GNU
>> uses.
>
> So here's a patch that changes nnweb defaults for help-gnu-emacs:

That looks great.

Just one question: does this keep the Google Groups source, or does it
replace it?

As much as I dislike Google Groups, I would still like it to stay
available as a fallback.
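
A sketch of the kind of fallback setup I have in mind, modelled on the
`gnus-refer-article-method' snippet posted earlier in this thread. It
assumes the `google' type stays defined in `nnweb-type-definition' next
to the new `emacs-help' type, which the patch as posted does not do; the
server names are arbitrary:

```elisp
;; Try the current select method first, then the lists.gnu.org search,
;; and only then Google Groups.
;; Assumption: both `emacs-help' and `google' are valid nnweb types.
(setq gnus-refer-article-method
      '(current
        (nnweb "emacs-help" (nnweb-type emacs-help))
        (nnweb "google" (nnweb-type google))))
```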


