Slow operations on buffers of tens of megabytes

Gnus development mailing list

* Slow operations on buffers of tens of megabytes
@ 2006-11-05  5:37 Alexandre Oliva
  2006-11-06  5:02 ` Richard Stallman
  0 siblings, 1 reply; 20+ messages in thread
From: Alexandre Oliva @ 2006-11-05  5:37 UTC (permalink / raw)



Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:

I use gnus to read my e-mail, and most of my messages are in nnfolder
groups.  This means lots and lots of messages are kept in a single
mbox-like file.  A few commonly-used groups have their e-mail stored
in files/buffers with a few tens of megabytes.

When I enter such a group, gnus goes over all unread or marked
messages searching for regular expressions to score them, i.e.,
determine whether the messages should be highlighted, hidden,
discarded, etc.

XEmacs 21.5.27 with xemacs-sumo 20060510 (including gnus 5.10) enters
such buffers very quickly, and only gets really slow for buffers that
exceed hundreds of megabytes, with tens of thousands of e-mails.

In GNU Emacs 22.0.90.1 (x86_64-unknown-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2006-11-02 on free
X server distributor `The X.Org Foundation', version 11.0.70101000
configured using `configure '--prefix=/home/aoliva/test/emacs-22.0.90' '--exec-prefix=/home/aoliva/test/emacs-22.0.90/H-x86_64-linux-gnu' 'CC=ccache gcc -fno-working-directory -m64''

using the built-in gnus 5.11, it takes minutes to enter groups with
just a few hundred messages and just a few tens of megabytes in the
underlying buffer/file, where XEmacs takes less than 10 seconds.

Scoring of the messages closer to the beginning of the buffer is fast,
but as we move to higher-numbered messages, that are closer to the end
of such big files/buffers, gnus will only score 2-3 messages per
minute, and that's what kills performance.

I can't tell whether it is general big-buffer management that is
causing such slowdowns, or if it's regular expression searching, or
some such, but this slowdown is severely impacting my ability to
switch back to GNU Emacs :-(

I've historically switched back and forth as new major releases came
up, even though I felt more at home at GNU Emacs.  However, last time
I used GNU Emacs for mail reading, I still used the nnml back end,
instead of nnfolder, and nnml keeps each message in a separate file,
so it was not affected by this problem.

However, since that was much slower to enter groups on both Emacsen,
because folders were fragmented, at some point in the last few months
I switched to nnfolder, and that also had a great impact on my mail
backup times :-)

I hope this is enough information to figure out who might help me
understand what the problem is and how I might be able to overcome it,
perhaps even helping fix it in GNU Emacs.

Thanks in advance,


Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8
  default-enable-multibyte-characters: t

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}


* Re: Slow operations on buffers of tens of megabytes
  2006-11-05  5:37 Slow operations on buffers of tens of megabytes Alexandre Oliva
@ 2006-11-06  5:02 ` Richard Stallman
  2006-11-06  6:02   ` Katsumi Yamaoka
  0 siblings, 1 reply; 20+ messages in thread
From: Richard Stallman @ 2006-11-06  5:02 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

    Scoring of the messages closer to the beginning of the buffer is fast,
    but as we move to higher-numbered messages, that are closer to the end
    of such big files/buffers, gnus will only score 2-3 messages per
    minute, and that's what kills performance.

Does Gnus make lots of overlays?  If so, maybe it needs to call
overlay-recenter from time to time.  Could someone please try that
and see if it makes things fast?


* Re: Slow operations on buffers of tens of megabytes
  2006-11-06  5:02 ` Richard Stallman
@ 2006-11-06  6:02   ` Katsumi Yamaoka
  2006-11-06  9:21     ` Reiner Steib
  0 siblings, 1 reply; 20+ messages in thread
From: Katsumi Yamaoka @ 2006-11-06  6:02 UTC (permalink / raw)
  Cc: emacs-pretest-bug, Alexandre Oliva, ding

>>>>> In <E1GgwcW-0000OJ-Pp@fencepost.gnu.org> Richard Stallman wrote:

>     Scoring of the messages closer to the beginning of the buffer is fast,
>     but as we move to higher-numbered messages, that are closer to the end
>     of such big files/buffers, gnus will only score 2-3 messages per
>     minute, and that's what kills performance.

> Does Gnus make lots of overlays?  If so, maybe it needs to call
> overlay-recenter from time to time.  Could someone please try that
> and see if it makes things fast?

AFAIK, Gnus uses text properties here and there, but does not use
overlays much.  The following settings make Gnus turn off almost all
overlays:

(setq gnus-article-button-face nil
      gnus-signature-face nil
      gnus-summary-selected-face nil
      gnus-treat-highlight-citation nil
      gnus-treat-emphasize nil)

If it makes Gnus fast, improving the performance will be worth
trying.  However, I didn't feel any difference, though it might
be because I don't have huge mail folders.

Regards,


* Re: Slow operations on buffers of tens of megabytes
  2006-11-06  6:02   ` Katsumi Yamaoka
@ 2006-11-06  9:21     ` Reiner Steib
  2006-11-06 20:00       ` Alexandre Oliva
  2006-11-12  5:14       ` Richard Stallman
  0 siblings, 2 replies; 20+ messages in thread
From: Reiner Steib @ 2006-11-06  9:21 UTC (permalink / raw)
  Cc: emacs-pretest-bug, Alexandre Oliva, ding

On Mon, Nov 06 2006, Katsumi Yamaoka wrote:

>>>>>> In <E1GgwcW-0000OJ-Pp@fencepost.gnu.org> Richard Stallman wrote:
>
>>     Scoring of the messages closer to the beginning of the buffer is fast,
>>     but as we move to higher-numbered messages, that are closer to the end
>>     of such big files/buffers, gnus will only score 2-3 messages per
>>     minute, and that's what kills performance.
[...]
> (setq gnus-article-button-face nil
>       gnus-signature-face nil
>       gnus-summary-selected-face nil
>       gnus-treat-highlight-citation nil
>       gnus-treat-emphasize nil)
>
> If it makes Gnus fast, improving the performance will be worth
> trying.  However, I didn't feel any difference, though it might
> be because I don't have huge mail folders.

I don't think this matches the problem description.  When scanning big
mbox files, article display isn't involved.  Or am I missing
something?

My guess is that it's a problem with case-fold-search when searching for
"X-Gnus-Article-Number" in mbox files in Emacs 22 as analyzed by Elias
Oltmanns back in May:

,----[ http://thread.gmane.org/gmane.emacs.devel/53901/focus=54013 ]
| From: Elias Oltmanns <oltmanns <at> uni-bonn.de>
| Subject: Re: New buffer-case-table makes search_buffer painfully slow
| Newsgroups: gmane.emacs.devel
| Date: 2006-05-06 19:10:08 GMT
| 
| Elias Oltmanns <oltmanns <at> uni-bonn.de> wrote:
| > Hi all,
| >
| > switching from emacs 21 to emacs 22 has a very significant performance
| > impact on packages that make heavy use of search_buffer. An example
| > that actually made me aware of this problem is gnus processing large
| > mbox files. Further analysis of this problem revealed that in emacs 22
| > an "i" in the search string makes search_buffer use simple_search()
| > instead of boyer_moore(). 
| 
| Emacs 22's EQUIVALENCES table relates i, and thus I as well, to two
| more characters with character codes 331857 and 331856. On
| www.unicode.org the character look up engine couldn't find a match for
| U+51051 or U+51050 saying that most likely those codes weren't
| assigned to any characters yet.
| 
| So, here is a plain question: Is there a bug in the case-table in
| emacs 22 or does the search engine on www.unicode.org for some reason
| miss certain character ranges? Slightly biased, I'm disregarding the
| possibility of me being unable to use www.unicode.org properly, which,
| in fact, might well be the reason for my confusion.
| 
| Second question: If the case-table was right, what would be the right
| way to tackle the problem described in my original post? For me the
| following snippet in .emacs solves the problem:
| --- ~/.emacs ---
| (unless (< emacs-major-version 22)
|   (set-case-syntax 331856 "w" (standard-case-table))
|   (set-case-syntax 331857 "w" (standard-case-table)))
| --- ~/.emacs ---
| 
| This, of course, is a dirty hack and I'm wondering whether emacs
| should provide a feature to "clean up" the EQUIVALENCES table in the
| ascii range in order to avoid falling back to a slow search
| algorithm when we are searching for pure ascii strings. Or do you
| think that packages like gnus which make heavy use of
| re-search-forward should handle these performance issues
| themselves---or indeed the users.
`----

Alexandre, could you please try if the hack suggested by Elias makes
your problem go away?
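
One way to measure the effect, as a minimal sketch: time a search for
every article marker in the current buffer, once with case folding and
once without.  `my-time-marker-search' is a hypothetical helper, not part
of Gnus or Emacs.  The assumption is that with `case-fold-search' bound
to nil no case translation is applied, so a large gap between the two
timings would point at the case table rather than at buffer size.

```elisp
;; Hypothetical helper for illustration; not part of Gnus or Emacs.
(require 'benchmark)

(defun my-time-marker-search (fold)
  "Return seconds spent finding every article marker in the current buffer.
FOLD is the value bound to `case-fold-search' during the search."
  (let ((case-fold-search fold))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-TIME); keep ELAPSED.
    (car (benchmark-run 1
           (save-excursion
             (goto-char (point-min))
             ;; Empty loop body: just walk over every match.
             (while (search-forward "X-Gnus-Article-Number" nil t)))))))
```

Evaluating (list (my-time-marker-search t) (my-time-marker-search nil))
in a large nnfolder buffer gives the two timings side by side.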

Richard proposed a fix for this, but AFAICS, this has not been
implemented:

,----[ http://thread.gmane.org/gmane.emacs.devel/53901/focus=54025 ]
| From: Richard Stallman <rms <at> gnu.org>
| Subject: Re: New buffer-case-table makes search_buffer painfully slow
| Newsgroups: gmane.emacs.devel
| Date: 2006-05-07 05:01:27 GMT
|
| I think this has to do with the special characters for Turkish,
| lower-case i without dot and upper-case I with dot.  In Turkish,
| upcasing and downcasing preserve the dot, or the absence of the dot.
| 
| I think these lines in characters.el are the cause of the problem.
| 
|   (set-downcase-syntax  ?İ ?i tbl)
|   (set-upcase-syntax    ?I ?ı tbl)
| 
| They set up only half of what Turkish needs.
| They make dotless-i upcase into I, and they make
| I-with-dot downcase into i.  They can't do vice versa
| because that would break things for other languages.
| So they are not really useful.  We could simply delete them.
| 
| We could also add a minor mode to set up the case table all the way
| for Turkish.
| 
| Would someone like to do that?
`----

Looking at the ChangeLog, it seems that the relevant code in
`characters.el' ...

,----[ international/characters.el ]
| ;; In some languages, U+0049 LATIN CAPITAL LETTER I and U+0131 LATIN
| ;; SMALL LETTER DOTLESS I make a case pair, and so do U+0130 LATIN
| ;; CAPITAL LETTER I WITH DOT ABOVE and U+0069 LATIN SMALL LETTER I.
| ;; Thus we have to check language-environment to handle casing
| ;; correctly.  Currently only I<->i is available.
| [...] 
|   (set-downcase-syntax  ?İ ?i tbl)
|   (set-upcase-syntax    ?I ?ı tbl)
`----

... has been changed back and forth several times:

,----[ ChangeLog ]
| 2005-04-01  Kenichi Handa  <handa@m17n.org>
| 
| 	* international/characters.el: Enable the correct case setting for
| 	dotless-i and dotted-I.
| 
| 2005-02-02  Kenichi Handa  <handa@m17n.org>
| 
| 	* international/characters.el: Cancel previous change for
| 	I-WITH-DOT-ABOVE and DOTLESS-i.
| 
| 2005-02-02  Kenichi Handa  <handa@m17n.org>
| 
| 	* international/latin-5.el (tbl): Setup cases of I-WITH-DOT-ABOVE,
| 	DOTLESS-i.
| 
| 	* international/characters.el: Setup cases of GREEK-FINAL-SIGMA,
| 	Y-WITH-DIAERESIS, I-WITH-DOT-ABOVE, DOTLESS-i.
`----

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: Slow operations on buffers of tens of megabytes
  2006-11-06  9:21     ` Reiner Steib
@ 2006-11-06 20:00       ` Alexandre Oliva
  2006-11-07 14:13         ` Reiner Steib
  2006-11-12  5:14       ` Richard Stallman
  1 sibling, 1 reply; 20+ messages in thread
From: Alexandre Oliva @ 2006-11-06 20:00 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

On Nov  6, 2006, Reiner Steib <reinersteib+gmane@imap.cc> wrote:

> My guess is that it's a problem with case-fold-search when searching for
> "X-Gnus-Article-Number" in mbox files in Emacs 22 as analyzed by Elias
> Oltmanns back in May:

Yep, that's it!

> | --- ~/.emacs ---
> | (unless (< emacs-major-version 22)
> |   (set-case-syntax 331856 "w" (standard-case-table))
> |   (set-case-syntax 331857 "w" (standard-case-table)))
> | --- ~/.emacs ---

This makes gnus blazingly fast again.

> | We could also add a minor mode to set up the case table all the way
> | for Turkish.
> | 
> | Would someone like to do that?

I can try to take a stab at it, but not being an Emacs hacker I just
barely understand the relationship between the reported bug and the
ultimate cause reported in the e-mail, never mind the proposed
workaround, which is indistinguishable from magic to me ;-)

I guess this means I may need some hand-holding, and at this point I'm
not sure that would be more work than actually making the changes.
Please advise.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}


* Re: Slow operations on buffers of tens of megabytes
  2006-11-06 20:00       ` Alexandre Oliva
@ 2006-11-07 14:13         ` Reiner Steib
  2006-11-08 14:43           ` Reiner Steib
  0 siblings, 1 reply; 20+ messages in thread
From: Reiner Steib @ 2006-11-07 14:13 UTC (permalink / raw)
  Cc: emacs-pretest-bug, Elias Oltmanns, ding

[ Cc-ing Elias Oltmanns; See
  <http://thread.gmane.org/or4ptezc71.fsf%40fsfla.org> or
  <http://thread.gmane.org/gmane.emacs.gnus.general/63925/focus=63929>
  for the full thread. ]

On Mon, Nov 06 2006, Alexandre Oliva wrote:

> On Nov  6, 2006, Reiner Steib <reinersteib+gmane@imap.cc> wrote:
>
>> My guess is that it's a problem with case-fold-search when searching for
>> "X-Gnus-Article-Number" in mbox files in Emacs 22 as analyzed by Elias
>> Oltmanns back in May:
>
> Yep, that's it!
>
>> | --- ~/.emacs ---
>> | (unless (< emacs-major-version 22)
>> |   (set-case-syntax 331856 "w" (standard-case-table))
>> |   (set-case-syntax 331857 "w" (standard-case-table)))
>> | --- ~/.emacs ---
>
> This makes gnus blazingly fast again.
>
>> | We could also add a minor mode to set up the case table all the way
>> | for Turkish.
>> | 
>> | Would someone like to do that?
>
> I can try to take a stab at it, but not being an Emacs hacker I just
> barely understand the relationship between the reported bug and the
> ultimate cause reported in the e-mail, never mind the proposed
> workaround, which is indistinguishable from magic to me ;-)
>
> I guess this means I may need some hand-holding, and at this point I'm
> not sure that would be more work than actually making the changes.
> Please advise.

If the problem can't be solved in Emacs, we could maybe change
`nnheader-find-file-noselect' to change the case table for the mbox
files.  The current code reads:

--8<---------------cut here---------------start------------->8---
(defun nnheader-find-file-noselect (&rest args)
  "Open a file with some variables bound.
See `find-file-noselect' for the arguments."
  (let* ((format-alist nil)
	 (auto-mode-alist (mm-auto-mode-alist))
	 (default-major-mode 'fundamental-mode)
	 (enable-local-variables nil)
	 (after-insert-file-functions nil)
	 (enable-local-eval nil)
	 (coding-system-for-read nnheader-file-coding-system)
	 (version-control 'never)
	 (ffh (if (boundp 'find-file-hook)
		  'find-file-hook
		'find-file-hooks))
	 (val (symbol-value ffh)))
    (set ffh nil)
    (unwind-protect
	(apply 'find-file-noselect args)
      (set ffh val))))
--8<---------------cut here---------------end--------------->8---

I expect that (apply 'find-file-noselect args) could be changed to:

--8<---------------cut here---------------start------------->8---
	(with-current-buffer (apply 'find-file-noselect args)
	  (unless (or (featurep 'xemacs)
		      ;; Better check?
		      (< emacs-major-version 22))
	    ;; Apply ASCII-only case-table. Don't modify the
	    ;; standard-case-table.
	    (SOME-CASE-TABLE-CODE)))
--8<---------------cut here---------------end--------------->8---

I don't know much about case tables in Emacs (and I don't have time to
dig deeper into the Lisp Manual).  Any suggestion on what
SOME-CASE-TABLE-CODE should look like?  Alexandre and Elias: Does this
patch give good results?

--8<---------------cut here---------------start------------->8---
--- nnheader.el	01 Aug 2006 12:10:19 +0200	7.24
+++ nnheader.el	07 Nov 2006 15:08:52 +0100	
@@ -997,7 +997,13 @@
 	 (val (symbol-value ffh)))
     (set ffh nil)
     (unwind-protect
-	(apply 'find-file-noselect args)
+	(with-current-buffer (apply 'find-file-noselect args)
+	  (unless (or (featurep 'xemacs)
+		      ;; Better check?
+		      (< emacs-major-version 22))
+	    ;; Apply ASCII-only case-table.  Don't modify the
+	    ;; standard-case-table.
+	    (set-case-table (make-char-table 'case-table))))
       (set ffh val))))
 
 (defun nnheader-directory-regular-files (dir)
--8<---------------cut here---------------end--------------->8---

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: Slow operations on buffers of tens of megabytes
  2006-11-07 14:13         ` Reiner Steib
@ 2006-11-08 14:43           ` Reiner Steib
  2006-11-09 22:00             ` Alexandre Oliva
  0 siblings, 1 reply; 20+ messages in thread
From: Reiner Steib @ 2006-11-08 14:43 UTC (permalink / raw)
  Cc: emacs-pretest-bug, Elias Oltmanns, ding

On Tue, Nov 07 2006, Reiner Steib wrote:

> Alexandre and Elias: Does this patch give good results?

Please consider this patch instead:

--8<---------------cut here---------------start------------->8---
--- nnheader.el	01 Aug 2006 12:10:19 +0200	7.24
+++ nnheader.el	08 Nov 2006 15:33:18 +0100	
@@ -997,7 +997,18 @@
 	 (val (symbol-value ffh)))
     (set ffh nil)
     (unwind-protect
-	(apply 'find-file-noselect args)
+	(with-current-buffer (apply 'find-file-noselect args)
+	  (unless (or (featurep 'xemacs)
+		      ;; Better check?
+		      (< emacs-major-version 22))
+	    (nnheader-message 7 "ASCII-only case-table in buffer `%s'."
+			      (current-buffer))
+	    ;; (sit-for 1)
+	    ;; Apply ASCII-only case-table.  Don't modify the
+	    ;; standard-case-table.
+	    (set-case-table (make-char-table 'case-table))
+	    ;; We must return the buffer:
+	    (current-buffer)))
       (set ffh val))))
 
 (defun nnheader-directory-regular-files (dir)
--8<---------------cut here---------------end--------------->8---

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: Slow operations on buffers of tens of megabytes
  2006-11-08 14:43           ` Reiner Steib
@ 2006-11-09 22:00             ` Alexandre Oliva
  2006-11-10 18:42               ` Richard Stallman
  2006-11-13 17:28               ` Reiner Steib
  0 siblings, 2 replies; 20+ messages in thread
From: Alexandre Oliva @ 2006-11-09 22:00 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

On Nov  8, 2006, Reiner Steib <reinersteib+gmane@imap.cc> wrote:

> On Tue, Nov 07 2006, Reiner Steib wrote:
>> Alexandre and Elias: Does this patch give good results?

> Please consider this patch instead:

Thanks for the patch.  It works, but it doesn't.

It works in that it does speed up entering in new folders.

However, it breaks mail splitting in that, at the time buffers are to
be saved, vc-before-save ends up trying to require vc-RCS, so
newly-split e-mail is not saved, remaining in open modified buffers
until I override vc-before-save and vc-after-save to empty functions
and save them all.  I don't think I lost any e-mail in the process.

I guess the vc backend code could set up some char table that enables
the mapping from RCS to rcs in order to get the correct backend file
name, but should it?
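
A likely explanation is that a table made by a bare
(make-char-table 'case-table) maps nothing, so downcasing "RCS" leaves
it unchanged.  As a minimal sketch of an alternative, the function below
builds a table that still pairs the ASCII letters and omits everything
else.  `my-make-ascii-case-table' is a hypothetical name, and this has
not been tried against nnfolder or vc.

```elisp
;; Hypothetical helper for illustration; not part of Gnus or Emacs.
(defun my-make-ascii-case-table ()
  "Return a fresh case table that pairs only A-Z with a-z.
No non-ASCII character has a case mapping in the result."
  (let ((table (make-char-table 'case-table)))
    (dotimes (i 26)
      ;; Register the upper-case/lower-case pair in TABLE only;
      ;; the standard case table is not changed.
      (set-case-syntax-pair (+ ?A i) (+ ?a i) table))
    table))
```

Installing the result with `set-case-table' in the mbox buffer should
keep (downcase "RCS") returning "rcs" there.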

Ultimately, I'm a bit concerned about messing with the case table of
an nnfolder buffer for the entire duration of the buffer.  It's hard
to tell whether there'd be any less visible fallouts.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}


* Re: Slow operations on buffers of tens of megabytes
  2006-11-09 22:00             ` Alexandre Oliva
@ 2006-11-10 18:42               ` Richard Stallman
  2006-11-11  0:37                 ` Reiner Steib
  2006-11-13 17:28               ` Reiner Steib
  1 sibling, 1 reply; 20+ messages in thread
From: Richard Stallman @ 2006-11-10 18:42 UTC (permalink / raw)
  Cc: emacs-pretest-bug, oltmanns, ding

    It works in that it does speed up entering in new folders.

    However, it breaks mail splitting in that, at the time buffers are to
    be saved, vc-before-save ends up trying to require vc-RCS, so
    newly-split e-mail is not saved, remaining in open modified buffers
    until I override vc-before-save and vc-after-save to empty functions
    and save them all.

That is a bad bug; was this patch installed?


* Re: Slow operations on buffers of tens of megabytes
  2006-11-10 18:42               ` Richard Stallman
@ 2006-11-11  0:37                 ` Reiner Steib
  2006-11-13 16:40                   ` Kevin Rodgers
  0 siblings, 1 reply; 20+ messages in thread
From: Reiner Steib @ 2006-11-11  0:37 UTC (permalink / raw)
  Cc: emacs-pretest-bug, Alexandre Oliva, oltmanns, ding

On Fri, Nov 10 2006, Richard Stallman wrote:

>     It works in that it does speed up entering in new folders.
>
>     However, it breaks mail splitting [...]
>
> That is a bad bug; was this patch installed?

No, I didn't install this patch.  I don't use the nnfolder back end of
Gnus, so I can't really test it.

However I think we must do something about this dotless-i/dotted-I
problem because it seems to render Gnus unusable with big (nnfolder)
mailbox files in Emacs 22.

As Elias Oltmanns already has pointed out, the problem is that
nnfolder often does re-searches for "X-Gnus-Article-Number: "
(`nnfolder-article-marker').  Wrapping these calls inside ...

  (with-case-table some-case-table-without-dotless-i/dotted-I
    (re-search-forward nnfolder-article-marker ...))

... might be a possibility.  But there's no `with-case-table' and I
don't know enough about case tables to develop a fix that doesn't break
anything else.

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: Slow operations on buffers of tens of megabytes
  2006-11-06  9:21     ` Reiner Steib
  2006-11-06 20:00       ` Alexandre Oliva
@ 2006-11-12  5:14       ` Richard Stallman
  1 sibling, 0 replies; 20+ messages in thread
From: Richard Stallman @ 2006-11-12  5:14 UTC (permalink / raw)
  Cc: emacs-pretest-bug, yamaoka, lxoliva, ding

I wrote code to (1) eliminate the peculiar upcasing dotless-i to I,
and downcasing I-with-dot to i in the default case, and (2) make the
Turkish language environment fully set up Turkish case conversion for
all four characters.

I will install it soon.


* Re: Slow operations on buffers of tens of megabytes
  2006-11-11  0:37                 ` Reiner Steib
@ 2006-11-13 16:40                   ` Kevin Rodgers
  2006-11-14 12:26                     ` Richard Stallman
  0 siblings, 1 reply; 20+ messages in thread
From: Kevin Rodgers @ 2006-11-13 16:40 UTC (permalink / raw)
  Cc: ding

Reiner Steib wrote:
> As Elias Oltmanns already has pointed out, the problem is that
> nnfolder often does re-searches for "X-Gnus-Article-Number: "
> (`nnfolder-article-marker').  Wrapping these calls inside ...
> 
>   (with-case-table some-case-table-without-dotless-i/dotted-I
>     (re-search-forward nnfolder-article-marker ...))
> 
> ... might be a possibility.  But there's no `with-case-table' and I
> don't know enough about case table to develop a fix that doesn't break
> anything else.

Just copy-and-paste the definition of with-syntax-table from subr.el,
replace `(syntax-table)' with `(current-case-table)', then replace any
remaining occurrences of "syntax" with "case".
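
Following that recipe gives roughly the macro below.  It is a sketch
only: the name `my-with-case-table' is hypothetical, chosen so that it
cannot clash with anything Emacs itself defines.

```elisp
;; Hypothetical macro for illustration, modelled on `with-syntax-table'.
(defmacro my-with-case-table (table &rest body)
  "Evaluate BODY with TABLE as the current buffer's case table.
The previous case table is restored afterwards, even on error.
The value returned is the value of the last form in BODY."
  (declare (indent 1) (debug t))
  (let ((old-table (make-symbol "old-table"))
        (old-buffer (make-symbol "old-buffer")))
    `(let ((,old-table (current-case-table))
           (,old-buffer (current-buffer)))
       (unwind-protect
           (progn (set-case-table ,table)
                  ,@body)
         ;; BODY may have switched buffers, so restore in the original one.
         (with-current-buffer ,old-buffer
           (set-case-table ,old-table))))))

;; Intended use, with SOME-TABLE standing for a suitable case table:
;;   (my-with-case-table some-table
;;     (re-search-forward nnfolder-article-marker nil t))
```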

-- 
Kevin


* Re: Slow operations on buffers of tens of megabytes
  2006-11-09 22:00             ` Alexandre Oliva
  2006-11-10 18:42               ` Richard Stallman
@ 2006-11-13 17:28               ` Reiner Steib
  2006-11-19  9:49                 ` Elias Oltmanns
  1 sibling, 1 reply; 20+ messages in thread
From: Reiner Steib @ 2006-11-13 17:28 UTC (permalink / raw)
  Cc: emacs-pretest-bug, Elias Oltmanns, ding

On Thu, Nov 09 2006, Alexandre Oliva wrote:

> Ultimately, I'm a bit concerned about messing with the case table of
> an nnfolder buffer for the entire duration of the buffer.  It's hard
> to tell whether there'd be any less visible fallouts.

Richard has eliminated the peculiar upcasing dotless-i to I in CVS.
Does it fix your problem?

(IIUC, it should fix it _unless_ the user has a Turkish language
environment.  I.e. Turkish Gnus users might still suffer from this
problem.)

,----
| 2006-11-12  Richard Stallman  <rms@gnu.org>
| 
| 	* language/european.el (turkish-case-conversion-enable)
| 	(turkish-case-conversion-disable): New functions.
| 	("Turkish" lang env): Use them.
| 
| 	* international/characters.el (case table):
| 	Do nothing special for i and I.
`----

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: Slow operations on buffers of tens of megabytes
  2006-11-13 16:40                   ` Kevin Rodgers
@ 2006-11-14 12:26                     ` Richard Stallman
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Stallman @ 2006-11-14 12:26 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

This should be fixed in the current sources.  Is it?


* Re: Slow operations on buffers of tens of megabytes
  2006-11-13 17:28               ` Reiner Steib
@ 2006-11-19  9:49                 ` Elias Oltmanns
  2006-11-20 12:59                   ` Richard Stallman
  0 siblings, 1 reply; 20+ messages in thread
From: Elias Oltmanns @ 2006-11-19  9:49 UTC (permalink / raw)
  Cc: emacs-pretest-bug

Hi all,

sorry for the delayed response, I've been rather busy these days.

@Reiner: Thanks for Cc-ing me, otherwise I'd most likely have missed
this thread.

Reiner Steib <reinersteib+gmane@imap.cc> wrote:
> On Thu, Nov 09 2006, Alexandre Oliva wrote:
>
>> Ultimately, I'm a bit concerned about messing with the case table of
>> an nnfolder buffer for the entire duration of the buffer.  It's hard
>> to tell whether there'd be any less visible fallouts.
>
> Richard has eliminated the peculiar upcasing dotless-i to I in CVS.
> Does it fix your problem?

Yes, it does. I'm quite pleased that this issue has been settled in
emacs as it's a fairly generic one, in my opinion. However, I'm
wondering whether there will be more of these in store after the
switch to the unicode based emacs 23. Without really knowing anything
about emacs 23, I'm just curious to know if something like a generic
mechanism to restrict pattern matching for heavily used functions like
re-search-forward to some limited case tables, e.g., the ASCII table,
will be provided (if applicable, that is) by emacs.

Regards,

Elias


* Re: Slow operations on buffers of tens of megabytes
  2006-11-19  9:49                 ` Elias Oltmanns
@ 2006-11-20 12:59                   ` Richard Stallman
  2006-11-20 18:22                     ` Elias Oltmanns
  0 siblings, 1 reply; 20+ messages in thread
From: Richard Stallman @ 2006-11-20 12:59 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

    Without really knowing anything
    about emacs 23, I'm just curious to know if something like a generic
    mechanism to restrict pattern matching for heavily used functions like
    re-search-forward to some limited case tables, e.g., the ASCII table,
    will be provided (if applicable, that is) by emacs.

Why would we want such a feature?  It is not needed to deal
with this problem, or any other problem I can recall.

Of course, you can always install your own case table.


* Re: Slow operations on buffers of tens of megabytes
  2006-11-20 12:59                   ` Richard Stallman
@ 2006-11-20 18:22                     ` Elias Oltmanns
  2006-11-21  7:47                       ` Richard Stallman
  0 siblings, 1 reply; 20+ messages in thread
From: Elias Oltmanns @ 2006-11-20 18:22 UTC (permalink / raw)
  Cc: ding

Richard Stallman <rms@gnu.org> wrote:
>     Without really knowing anything
>     about emacs 23, I'm just curious to know if something like a generic
>     mechanism to restrict pattern matching for heavily used functions like
>     re-search-forward to some limited case tables, e.g., the ASCII table,
>     will be provided (if applicable, that is) by emacs.
>
> Why would we want such a feature?  It is not needed to deal
> with this problem, or any other problem I can recall.

The original problem was that search-forward couldn't use boyer_moore
when gnus scanned the nnfolder for article numbers. This was due to the
fact that i was related to its dotless version, i.e. a character from
a row further down (or up?) the unicode table. In src/search.c it
says:

--8<---------------cut here---------------start------------->8---
   This kind of search works if all the characters in BASE_PAT that
   have nontrivial translation are the same aside from the last byte.
   This makes it possible to translate just the last byte of a
   character, and do so after just a simple test of the context.
--8<---------------cut here---------------end--------------->8---

As I said, I don't really know anything about emacs 23, nor do I know
very much about unicode for that matter. I just thought that there
might be more such links between characters of different rows in
future case and translation tables. From the point of view of gnus and
similar packages it might be desirable to temporarily use some
restricted translation table (restricted to only characters of one row
in the terms of the quote above) for internal purposes.

>
> Of course, you can always install your own case table.

Maybe, I'm just a bit confused and this will turn out to be the only
feasible solution in the end.

Regards,

Elias


* Re: Slow operations on buffers of tens of megabytes
  2006-11-20 18:22                     ` Elias Oltmanns
@ 2006-11-21  7:47                       ` Richard Stallman
  2006-11-21  8:18                         ` Kenichi Handa
  0 siblings, 1 reply; 20+ messages in thread
From: Richard Stallman @ 2006-11-21  7:47 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

    As I said, I don't really know anything about emacs 23, nor do I know
    very much about unicode for that matter. I just thought that there
    might be more such links between characters of different rows in
    future case and translation tables.

Is this in fact the case, for the default case table of Emacs unicode
2 branch, when the change I made for dotless i is installed there?


* Re: Slow operations on buffers of tens of megabytes
  2006-11-21  7:47                       ` Richard Stallman
@ 2006-11-21  8:18                         ` Kenichi Handa
  2006-11-22 13:15                           ` Richard Stallman
  0 siblings, 1 reply; 20+ messages in thread
From: Kenichi Handa @ 2006-11-21  8:18 UTC (permalink / raw)
  Cc: emacs-pretest-bug, oltmanns, ding

In article <E1GmQM6-0005qv-AX@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:

>     As I said, I don't really know anything about emacs 23, nor do I know
>     very much about unicode for that matter. I just thought that there
>     might be more such links between characters of different rows in
>     future case and translation tables.

> Is this in fact the case, for the default case table of Emacs unicode
> 2 branch, when the change I made for dotless i is installed there?

Your change is not yet propagated to emacs-unicode-2 branch.

---
Kenichi Handa
handa@m17n.org


* Re: Slow operations on buffers of tens of megabytes
  2006-11-21  8:18                         ` Kenichi Handa
@ 2006-11-22 13:15                           ` Richard Stallman
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Stallman @ 2006-11-22 13:15 UTC (permalink / raw)
  Cc: oltmanns, emacs-pretest-bug, ding

    > Is this in fact the case, for the default case table of Emacs unicode
    > 2 branch, when the change I made for dotless i is installed there?

    Your change is not yet propagated to emacs-unicode-2 branch.

Is there any reason not to propagate it there?
If it will be propagated, that will fix the problem, right?




