Gnus development mailing list
* gnus-split-references tweak
@ 2002-02-06 18:55 Paul Jarc
  2002-02-06 20:50 ` Paul Jarc
  0 siblings, 1 reply; 7+ messages in thread
From: Paul Jarc @ 2002-02-06 18:55 UTC (permalink / raw)


I've found Message-IDs generated by Lotus Notes that contain a space
just before the ">".  (Yep, they're malformed.)  Gnus dutifully
preserves them in its References, but then it's unable to recognize
them for threading.  This patch makes them recognizable: everything
from one "<" to the next "<" is included in a reference except
trailing whitespace.  Any objections, or should I commit?  Might this
confuse some parts of Gnus, when references suddenly start containing
internal whitespace?

--- lisp/gnus-util.el   2002/01/27 11:16:57     6.46
+++ lisp/gnus-util.el   2002/02/06 18:59:13
@@ -470,5 +470,5 @@
   (let ((beg 0)
        ids)
-    (while (string-match "<[^> \t]+>" references beg)
+    (while (string-match "<[^<]+[^< \t]" references beg)
       (push (substring references (match-beginning 0) (setq beg (match-end 0)))
            ids))
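
A minimal sketch of what the patch changes, using a made-up References
string.  `my-split-references' is a hypothetical stand-in for the loop in
the diff with the regexp passed as an argument; the final `nreverse' is an
assumption, since the hunk does not show how the function ends.

```elisp
;; Hypothetical helper: the loop from the diff, with the regexp as a parameter.
(defun my-split-references (regexp references)
  "Return the substrings of REFERENCES that match REGEXP, in order."
  (let ((beg 0)
        ids)
    (while (string-match regexp references beg)
      (push (substring references (match-beginning 0) (setq beg (match-end 0)))
            ids))
    ;; Assumption: the IDs are returned in header order.
    (nreverse ids)))

;; Made-up header; the second ID has a space just before the ">".
(let ((refs "<a@example.org> <b@notes.example >  <c@example.org>"))
  (list (my-split-references "<[^> \t]+>" refs)       ; old regexp
        (my-split-references "<[^<]+[^< \t]" refs)))  ; patched regexp
;; => (("<a@example.org>" "<c@example.org>")
;;     ("<a@example.org>" "<b@notes.example >" "<c@example.org>"))
```

The old regexp skips the malformed ID entirely, while the patched one keeps
it, internal space included, and drops only the whitespace between IDs.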


paul




* Re: gnus-split-references tweak
  2002-02-06 18:55 gnus-split-references tweak Paul Jarc
@ 2002-02-06 20:50 ` Paul Jarc
  2002-02-06 20:50   ` Jesper Harder
  0 siblings, 1 reply; 7+ messages in thread
From: Paul Jarc @ 2002-02-06 20:50 UTC (permalink / raw)


I wrote:
> I've found Message-IDs generated by Lotus Notes that contain a space
> just before the ">".  (Yep, they're malformed.)  Gnus dutifully
> preserves them in its References, but then it's unable to recognize
> them for threading.  This patch makes them recognizable: everything
> from one "<" to the next "<" is included in a reference except
> trailing whitespace.

Hm.  It seems this actually isn't sufficient for recognizing them
anyway.  I'm not sure what else would need to be changed, though.


paul




* Re: gnus-split-references tweak
  2002-02-06 20:50 ` Paul Jarc
@ 2002-02-06 20:50   ` Jesper Harder
  2002-02-06 21:32     ` Paul Jarc
  0 siblings, 1 reply; 7+ messages in thread
From: Jesper Harder @ 2002-02-06 20:50 UTC (permalink / raw)


prj@po.cwru.edu (Paul Jarc) writes:

> I wrote:
>> I've found Message-IDs generated by Lotus Notes that contain a space
>> just before the ">".  (Yep, they're malformed.)  Gnus dutifully
>> preserves them in its References, but then it's unable to recognize
>> them for threading.  This patch makes them recognizable: everything
>> from one "<" to the next "<" is included in a reference except
>> trailing whitespace.
>
> Hm.  It seems this actually isn't sufficient for recognizing them
> anyway.  I'm not sure what else would need to be changed, though.

Maybe you also have to change the regexp in `gnus-parent-id':

      (when (string-match "<[^> \t]+>\\'" references)






* Re: gnus-split-references tweak
  2002-02-06 20:50   ` Jesper Harder
@ 2002-02-06 21:32     ` Paul Jarc
  2002-02-07  1:59       ` Jesper Harder
  0 siblings, 1 reply; 7+ messages in thread
From: Paul Jarc @ 2002-02-06 21:32 UTC (permalink / raw)


Jesper Harder <harder@ifa.au.dk> wrote:
> prj@po.cwru.edu (Paul Jarc) writes:
>> Hm.  It seems this actually isn't sufficient for recognizing them
>> anyway.  I'm not sure what else would need to be changed, though.
>
> Maybe you also have to change the regexp in `gnus-parent-id':
>
>       (when (string-match "<[^> \t]+>\\'" references)

Oops.  Yep, that works.  I'll commit it tomorrow if no one objects.

-      (when (string-match "<[^> \t]+>\\'" references)
-       (match-string 0 references)))))
+      (when (string-match "\(<[^<]+[^< \t]\)[ \t]*\\'" references)
+       (match-string 1 references)))))


paul




* Re: gnus-split-references tweak
  2002-02-06 21:32     ` Paul Jarc
@ 2002-02-07  1:59       ` Jesper Harder
  2002-02-07 16:30         ` Paul Jarc
  2002-02-07 20:06         ` Paul Jarc
  0 siblings, 2 replies; 7+ messages in thread
From: Jesper Harder @ 2002-02-07  1:59 UTC (permalink / raw)


prj@po.cwru.edu (Paul Jarc) writes:

> Jesper Harder <harder@ifa.au.dk> wrote:
>> prj@po.cwru.edu (Paul Jarc) writes:
>>> Hm.  It seems this actually isn't sufficient for recognizing them
>>> anyway.  I'm not sure what else would need to be changed, though.
>>
>> Maybe you also have to change the regexp in `gnus-parent-id':
>>
>>       (when (string-match "<[^> \t]+>\\'" references)
>
> Oops.  Yep, that works.  I'll commit it tomorrow if no one objects.
> -      (when (string-match "<[^> \t]+>\\'" references)
> -       (match-string 0 references)))))
> +      (when (string-match "\(<[^<]+[^< \t]\)[ \t]*\\'" references)
> +       (match-string 1 references)))))

You missed a couple of backslashes in the regexp, it should be

  "\\(<[^<]+[^< \t]\\)[ \t]*\\'"

But this regexp is rather expensive compared to the old one
(`gnus-parent-id' is called several times on each article).  Entering a
10k group was 15% slower for me.

I think this one:

      (when (string-match "\\(<[^>]+>\\)[ \t]*\\'" references)
	(match-string 1 references)))))

is more efficient -- the performance penalty was only about 2%.
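
A rough way to compare the regexps in isolation, assuming a made-up
References string; `my-sample-refs' and `my-time-regexp' are hypothetical
names.  This only times `string-match' itself, so it will not reproduce the
group-entry percentages quoted above.

```elisp
(require 'benchmark)

;; Made-up References header with 20 IDs; real headers will differ.
(defvar my-sample-refs
  (mapconcat (lambda (n) (format "<msg%d@example.org>" n))
             (number-sequence 1 20)
             " "))

(defun my-time-regexp (regexp)
  "Return the seconds taken by 10000 `string-match' calls with REGEXP."
  (car (benchmark-run 10000
         (string-match regexp my-sample-refs))))

;; Old regexp, the corrected first attempt, and the cheaper alternative.
(list (my-time-regexp "<[^> \t]+>\\'")
      (my-time-regexp "\\(<[^<]+[^< \t]\\)[ \t]*\\'")
      (my-time-regexp "\\(<[^>]+>\\)[ \t]*\\'"))
```

The result is a list of three elapsed times in seconds; the absolute numbers
depend on the machine and on the Emacs version.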






* Re: gnus-split-references tweak
  2002-02-07  1:59       ` Jesper Harder
@ 2002-02-07 16:30         ` Paul Jarc
  2002-02-07 20:06         ` Paul Jarc
  1 sibling, 0 replies; 7+ messages in thread
From: Paul Jarc @ 2002-02-07 16:30 UTC (permalink / raw)


Jesper Harder <harder@ifa.au.dk> wrote:
> prj@po.cwru.edu (Paul Jarc) writes:
>> -      (when (string-match "<[^> \t]+>\\'" references)
>> -       (match-string 0 references)))))
>> +      (when (string-match "\(<[^<]+[^< \t]\)[ \t]*\\'" references)
>> +       (match-string 1 references)))))
>
> You missed a couple of backslashes in the regexp, it should be
>
>   "\\(<[^<]+[^< \t]\\)[ \t]*\\'"

Not my day.  And yet somehow it seems to work anyway.  Weird.
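
A small sketch of what the Lisp reader does with the single-backslash
version, using a made-up References string.  It shows only how the two
strings are read and matched; it does not explain why threading still
appeared to work in the test mentioned above.

```elisp
;; In a Lisp string, a backslash before a character that has no escape
;; meaning is dropped by the reader, so "\(" is the same string as "(".
(string= "\(" "(")   ; => t
(length "\\(")       ; => 2 (a backslash and a paren)

(let ((refs "<a@example.org> <b@notes.example >"))
  (list
   ;; Single backslashes: the regexp contains literal parentheses,
   ;; which do not occur in REFS, so there is no match.
   (string-match "\(<[^<]+[^< \t]\)[ \t]*\\'" refs)
   ;; Doubled backslashes: \( and \) delimit group 1.
   (and (string-match "\\(<[^<]+[^< \t]\\)[ \t]*\\'" refs)
        (match-string 1 refs))))
;; => (nil "<b@notes.example >")
```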

> I think this one:
>
>       (when (string-match "\\(<[^>]+>\\)[ \t]*\\'" references)
> 	(match-string 1 references)))))
>
> is more efficient -- the performance penalty was only about 2%.

Yeah, that should be ok.  I've sometimes seen References fields
truncated in the middle of a Message-ID, and this regexp won't find
the last, partial one, but that's ok because the partial one wouldn't
have matched any message's Message-ID anyway.
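
The truncated case, sketched with a made-up string: the regexp needs a
closing ">" followed only by whitespace before the end of the string, so it
finds nothing here rather than returning the partial ID.

```elisp
;; Made-up References value cut off in the middle of the second ID.
(let ((truncated "<a@example.org> <b@exam"))
  (when (string-match "\\(<[^>]+>\\)[ \t]*\\'" truncated)
    (match-string 1 truncated)))
;; => nil
```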


paul




* Re: gnus-split-references tweak
  2002-02-07  1:59       ` Jesper Harder
  2002-02-07 16:30         ` Paul Jarc
@ 2002-02-07 20:06         ` Paul Jarc
  1 sibling, 0 replies; 7+ messages in thread
From: Paul Jarc @ 2002-02-07 20:06 UTC (permalink / raw)


Jesper Harder <harder@ifa.au.dk> wrote:
> I think this one:
>
>       (when (string-match "\\(<[^>]+>\\)[ \t]*\\'" references)
> 	(match-string 1 references)))))
>
> is more efficient -- the performance penalty was only about 2%.

I've committed this version, which seems comparably fast to the
original:

      (when (string-match "<[^<]+\\'" references)

The original didn't allow trailing whitespace either, so I don't feel
bad about not allowing it in the new version.
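
A sketch comparing the committed regexp with the original on made-up
headers.  `my-parent-id' is a hypothetical wrapper that returns match 0; the
surrounding code of `gnus-parent-id' is not shown in this thread.

```elisp
;; Hypothetical wrapper: return whatever REGEXP matches in REFERENCES.
(defun my-parent-id (regexp references)
  "Return the text matched by REGEXP in REFERENCES, or nil."
  (when (string-match regexp references)
    (match-string 0 references)))

;; Last ID has a space before ">": only the committed regexp finds it.
(my-parent-id "<[^<]+\\'"     "<a@example.org> <b@notes.example >")
;; => "<b@notes.example >"
(my-parent-id "<[^> \t]+>\\'" "<a@example.org> <b@notes.example >")
;; => nil

;; Trailing whitespace after the last ID: the original finds nothing,
;; and the committed regexp keeps the whitespace in its match.
(my-parent-id "<[^> \t]+>\\'" "<a@example.org> <b@example.org> ")
;; => nil
(my-parent-id "<[^<]+\\'"     "<a@example.org> <b@example.org> ")
;; => "<b@example.org> "
```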


paul



