* Re: sanitized mm-string-to-multibyte [not found] <8763jkg7cu.fsf@liv.ac.uk> @ 2009-02-09 10:28 ` Katsumi Yamaoka 2009-02-09 23:27 ` Dave Love 0 siblings, 1 reply; 3+ messages in thread From: Katsumi Yamaoka @ 2009-02-09 10:28 UTC (permalink / raw) To: Dave Love; +Cc: bugs, ding >>>>> Dave Love wrote: > I found some IMAP messages were crashing Emacs, and I was led to > mm-string-to-multibyte. I'm not sure exactly what the crash was due to, > but the function isn't very sane in Emacs 21. This version doesn't cons > a string for each character. Various uses of the function are at least > dubious, and I'll send patches later. mm-with-preserved-unibyte is > useful for those changes and elsewhere. > 2009-02-08 Dave Love <fx@gnu.org> > * mm-util.el (mm-identity-nat, mm-with-preserved-unibyte): New. > (mm-string-to-multibyte): Use them. Your version of `mm-string-to-multibyte' doesn't seem to convert a unibyte string to a multibyte string. In Emacs 21.1~21.4 I got: (let* ((s1 (string-as-unibyte "a")) (s2 (mm-with-preserved-unibyte (string-make-multibyte s1)))) (list (multibyte-string-p s1) (multibyte-string-p s2))) => (nil nil) (let* ((s1 (string-as-multibyte "a")) (s2 (mm-with-preserved-unibyte (string-make-multibyte s1)))) (list (multibyte-string-p s1) (multibyte-string-p s2))) => (t t) Did I miss something? > Index: mm-util.el > =================================================================== > RCS file: /usr/local/cvsroot/gnus/lisp/mm-util.el,v > retrieving revision 7.91 > diff -u -r7.91 mm-util.el > --- mm-util.el 14 Jan 2009 00:52:01 -0000 7.91 > +++ mm-util.el 8 Feb 2009 17:27:12 -0000 > @@ -202,6 +202,22 @@ > (defalias 'mm-decode-coding-region 'decode-coding-region) > (defalias 'mm-encode-coding-region 'encode-coding-region))) > +(defconst mm-identity-nat (let (l) > + (dotimes (i 256) > + (push (cons i i) l)) > + (make-translation-table l)) > + "Translation table that applies the identity trasnlation.") > + > +(defmacro mm-with-preserved-unibyte (&rest body) > + "Execute BODY forms while preserving unibyte characters. > +Such characters are not converted automatically to multibyte ones > +when, for instance, inserted into a multibyte buffer within the > +BODY forms." > + `(let ((nonascii-translation-table mm-identity-nat)) > + ,@body)) > +(put 'mm-with-preserved-unibyte 'lisp-indent-function 0) > +(put 'mm-with-preserved-unibyte 'edebug-form-spec '(body)) > + > ;; `string-to-multibyte' is available only in Emacs 22.1 or greater. > (defalias 'mm-string-to-multibyte > (cond > @@ -210,11 +226,8 @@ > ((fboundp 'string-to-multibyte) > 'string-to-multibyte) > (t > - (lambda (string) > - "Return a multibyte string with the same individual chars as STRING." > - (mapconcat > - (lambda (ch) (mm-string-as-multibyte (char-to-string ch))) > - string ""))))) > + (lambda (s) > + (mm-with-preserved-unibyte (string-make-multibyte s)))))) > ;; `char-or-char-int-p' is an XEmacs function, not available in Emacs. > (eval-and-compile ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: sanitized mm-string-to-multibyte 2009-02-09 10:28 ` sanitized mm-string-to-multibyte Katsumi Yamaoka @ 2009-02-09 23:27 ` Dave Love 2009-02-10 0:12 ` Katsumi Yamaoka 0 siblings, 1 reply; 3+ messages in thread From: Dave Love @ 2009-02-09 23:27 UTC (permalink / raw) To: Katsumi Yamaoka; +Cc: bugs, ding Katsumi Yamaoka <yamaoka@jpl.org> writes: > Your version of `mm-string-to-multibyte' doesn't seem to convert > a unibyte string to a multibyte string. In Emacs 21.1~21.4 I got: > > (let* ((s1 (string-as-unibyte "a")) > (s2 (mm-with-preserved-unibyte (string-make-multibyte s1)))) > (list (multibyte-string-p s1) (multibyte-string-p s2))) > => (nil nil) > > (let* ((s1 (string-as-multibyte "a")) > (s2 (mm-with-preserved-unibyte (string-make-multibyte s1)))) > (list (multibyte-string-p s1) (multibyte-string-p s2))) > => (t t) > > Did I miss something? I think it doesn't cons a new string in the trivial case like that, when it won't matter. Use this version if you want always to cons a multibyte string. Index: mm-util.el =================================================================== RCS file: /usr/local/cvsroot/gnus/lisp/mm-util.el,v retrieving revision 7.91 diff -u -r7.91 mm-util.el --- mm-util.el 14 Jan 2009 00:52:01 -0000 7.91 +++ mm-util.el 9 Feb 2009 23:26:20 -0000 @@ -202,6 +202,22 @@ (defalias 'mm-decode-coding-region 'decode-coding-region) (defalias 'mm-encode-coding-region 'encode-coding-region))) +(defconst mm-identity-nat (let (l) + (dotimes (i 256) + (push (cons i i) l)) + (make-translation-table l)) + "Non-ASCII translation table that applies the identity translation.") + +(defmacro mm-with-preserved-unibyte (&rest body) + "Execute BODY forms while preserving unibyte characters. +Such characters are not converted automatically to multibyte ones +when, for instance, inserted into a multibyte buffer within the +BODY forms." + `(let ((nonascii-translation-table mm-identity-nat)) + ,@body)) +(put 'mm-with-preserved-unibyte 'lisp-indent-function 0) +(put 'mm-with-preserved-unibyte 'edebug-form-spec '(body)) + ;; `string-to-multibyte' is available only in Emacs 22.1 or greater. (defalias 'mm-string-to-multibyte (cond @@ -210,11 +226,9 @@ ((fboundp 'string-to-multibyte) 'string-to-multibyte) (t - (lambda (string) - "Return a multibyte string with the same individual chars as STRING." - (mapconcat - (lambda (ch) (mm-string-as-multibyte (char-to-string ch))) - string ""))))) + (lambda (s) + (mm-with-preserved-unibyte + (concat s (eval-when-compile (string-as-multibyte "")))))))) ;; `char-or-char-int-p' is an XEmacs function, not available in Emacs. (eval-and-compile ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: sanitized mm-string-to-multibyte 2009-02-09 23:27 ` Dave Love @ 2009-02-10 0:12 ` Katsumi Yamaoka 0 siblings, 0 replies; 3+ messages in thread From: Katsumi Yamaoka @ 2009-02-10 0:12 UTC (permalink / raw) To: Dave Love; +Cc: bugs, ding >>>>> Dave Love wrote: > Katsumi Yamaoka <yamaoka@jpl.org> writes: >> Your version of `mm-string-to-multibyte' doesn't seem to convert >> a unibyte string to a multibyte string. In Emacs 21.1~21.4 I got: >> >> (let* ((s1 (string-as-unibyte "a")) >> (s2 (mm-with-preserved-unibyte (string-make-multibyte s1)))) >> (list (multibyte-string-p s1) (multibyte-string-p s2))) >> => (nil nil) >> >> (let* ((s1 (string-as-multibyte "a")) >> (s2 (mm-with-preserved-unibyte (string-make-multibyte s1)))) >> (list (multibyte-string-p s1) (multibyte-string-p s2))) >> => (t t) >> >> Did I miss something? > I think it doesn't cons a new string in the trivial case like that, when > it won't matter. Use this version if you want always to cons a > multibyte string. I verified how (eval-when-compile (string-as-multibyte "")) behaves with this test.el file: (defun test () (let ((s1 "") (s2 (eval-when-compile (string-as-multibyte ""))) (s3 (eval-when-compile (string-as-unibyte "")))) (message "%s %s %s" (multibyte-string-p s1) (multibyte-string-p s2) (multibyte-string-p s3)))) (test) $ emacs-21.4 -batch -q -l ./test.el => nil t nil $ emacs-21.4 -batch -q -f batch-byte-compile ./test.el $ emacs-21.4 -batch -q -l ./test.elc => nil nil nil So there seems to be no difference in those ""s in the byte compiled file. > Index: mm-util.el > =================================================================== > RCS file: /usr/local/cvsroot/gnus/lisp/mm-util.el,v > retrieving revision 7.91 > diff -u -r7.91 mm-util.el > --- mm-util.el 14 Jan 2009 00:52:01 -0000 7.91 > +++ mm-util.el 9 Feb 2009 23:26:20 -0000 > @@ -202,6 +202,22 @@ > (defalias 'mm-decode-coding-region 'decode-coding-region) > (defalias 'mm-encode-coding-region 'encode-coding-region))) > +(defconst mm-identity-nat (let (l) > + (dotimes (i 256) > + (push (cons i i) l)) > + (make-translation-table l)) > + "Non-ASCII translation table that applies the identity translation.") > + > +(defmacro mm-with-preserved-unibyte (&rest body) > + "Execute BODY forms while preserving unibyte characters. > +Such characters are not converted automatically to multibyte ones > +when, for instance, inserted into a multibyte buffer within the > +BODY forms." > + `(let ((nonascii-translation-table mm-identity-nat)) > + ,@body)) > +(put 'mm-with-preserved-unibyte 'lisp-indent-function 0) > +(put 'mm-with-preserved-unibyte 'edebug-form-spec '(body)) > + > ;; `string-to-multibyte' is available only in Emacs 22.1 or greater. > (defalias 'mm-string-to-multibyte > (cond > @@ -210,11 +226,9 @@ > ((fboundp 'string-to-multibyte) > 'string-to-multibyte) > (t > - (lambda (string) > - "Return a multibyte string with the same individual chars as STRING." > - (mapconcat > - (lambda (ch) (mm-string-as-multibyte (char-to-string ch))) > - string ""))))) > + (lambda (s) > + (mm-with-preserved-unibyte > + (concat s (eval-when-compile (string-as-multibyte "")))))))) > ;; `char-or-char-int-p' is an XEmacs function, not available in Emacs. > (eval-and-compile ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2009-02-10 0:12 UTC | newest] Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <8763jkg7cu.fsf@liv.ac.uk> 2009-02-09 10:28 ` sanitized mm-string-to-multibyte Katsumi Yamaoka 2009-02-09 23:27 ` Dave Love 2009-02-10 0:12 ` Katsumi Yamaoka
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).