From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.user/4278
Path: news.gmane.org!not-for-mail
From: Aidan Kehoe
Newsgroups: gmane.emacs.gnus.user
Subject: XEmacs, Gnus and mm-coding-system-priorities.
Date: Thu, 02 Dec 2004 11:59:01 +0000
Message-ID: <16815.901.778441.575923.z25zdq@parhasard.net>
NNTP-Posting-Host: main.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Trace: sea.gmane.org 1138670243 22602 80.91.229.2 (31 Jan 2006 01:17:23 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Tue, 31 Jan 2006 01:17:23 +0000 (UTC)
Original-X-From: nobody Tue Jan 17 17:33:31 2006
Original-Path: quimby.gnus.org!newsfeed1.e.nsc.no!uio.no!feed.news.tiscali.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
Original-Newsgroups: gnu.emacs.gnus,comp.emacs.xemacs
Original-X-Trace: individual.net jhpJO/FF8BZRMigu7NWGtQ+sBxloPjm28bDGU6uDpspE9k07y/
User-Agent: Gnus/5.1002 (Gnus v5.10.2) XEmacs/21.4 (Rational FORTRAN, linux)
Cancel-Lock: sha1:zlOLx050Mc+xxxPxnn0+jPB8Xvk=
Original-Xref: bridgekeeper.physik.uni-ulm.de gnus-emacs-gnus:4419
Original-Lines: 109
X-Gnus-Article-Number: 4419 Tue Jan 17 17:33:31 2006
Xref: news.gmane.org gmane.emacs.gnus.user:4278
Archived-At:

Hi,

Further to my message of the 26th, lvmzx4a1vb.fsf@ns5.nestdesign.com ,
I’ve made a patch to mm-util.el that takes advantage of Stephen
Turnbull’s Latin Unity to remap messages’ characters and to take notice
of the mm-coding-system-priorities variable under XEmacs.

With this patch applied, and with latin-unity available,

    (setq mm-coding-system-priorities '(iso-8859-1 iso-8859-15 utf-8))

in your init file tells Gnus to post in Latin 1 if the message fits into
Latin 1--including if, say, Latin 2 U WITH DIAERESIS is used--iso-8859-15
if the message fits into that but not Latin 1, and UTF-8 if neither of
those things is true. This is much preferable to the current behaviour.
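
The selection rule described above (use the first entry in the priority list that can represent the whole message) can be sketched roughly as follows. This is not the patch itself. It assumes a Unicode-based GNU Emacs (22 or later), where `unencodable-char-position` accepts a string argument and no Latin Unity remapping is needed. The function name `my-first-feasible-coding-system` is hypothetical and exists only for illustration.

```elisp
;; Hypothetical helper, not part of Gnus or of the patch below.
;; Assumption: a Unicode-based GNU Emacs, where
;; `unencodable-char-position' takes an optional STRING argument.
(defun my-first-feasible-coding-system (string priorities)
  "Return the first coding system in PRIORITIES that can encode STRING.
Return nil if none of them can."
  (catch 'done
    (dolist (cs priorities)
      ;; `unencodable-char-position' returns nil when every character
      ;; between the two indices can be encoded in CS.
      (unless (unencodable-char-position 0 (length string) cs nil string)
        (throw 'done cs)))
    nil))

;; Usage: try three short strings against the priority list from the post.
(let ((priorities '(iso-8859-1 iso-8859-15 utf-8)))
  (list (my-first-feasible-coding-system "für" priorities)
        (my-first-feasible-coding-system "100 €" priorities)
        (my-first-feasible-coding-system "ő" priorities)))
```

Walking the list in order and stopping at the first match is what makes the order of `mm-coding-system-priorities` meaningful: a universal coding system such as utf-8 placed last acts as the fallback, and placed first it would always win.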

Tested under 21.4, 21.5 and the stable GNU Emacs. If you have trouble
applying the patch from a news article, there’s a plain text version
available at
http://parhasard.net/mm-util-xemacs-coding-system-priorities.diff .

What do I have to do to get this included in the standard Gnus?

Best regards,

	- Aidan
-- 
“As democracy is perfected, the office of president represents, more and
more closely, the inner soul of the people. On some great and glorious
day the plain folks of the land will reach their heart’s desire at last
and the White House will be adorned by a downright moron.” – H.L. Mencken

--- mm-util.el~ 2004-12-02 10:43:08.000000000 +0000
+++ mm-util.el 2004-12-02 11:42:03.000000000 +0000
@@ -587,11 +587,68 @@
           charsets))
     ;; Otherwise we're not multibyte, we're XEmacs, or a single
     ;; coding system won't cover it.
-    (setq charsets
-          (mm-delete-duplicates
-           (mapcar 'mm-mime-charset
-                   (delq 'ascii
-                         (mm-find-charset-region b e))))))
+
+    ;; For intelligent handling of the various ISO-8859-? character sets
+    ;; and their common subsets under XEmacs, we use latin-unity.
+    (when (and (not (featurep 'latin-unity))
+               (locate-library "latin-unity"))
+      (require 'latin-unity))
+
+    (if (featurep 'latin-unity)
+        (let ((csets (latin-unity-representations-feasible-region b e))
+              (psets (latin-unity-representations-present-region b e))
+              (systems mm-coding-system-priorities)
+              (chars-region (delq 'ascii (charsets-in-region b e))) curset)
+
+          (assert (featurep 'xemacs) t
+                  "Latin Unity shouldn't be available on GNU Emacs.")
+
+          (setq charsets
+                (catch 'done
+
+                  ;; Check whether all Latin Unity knows about all the
+                  ;; character sets in the region. If it doesn't, and we
+                  ;; have a universal coding system in the
+                  ;; mm-coding-system-priorities list, return that
+                  ;; universal coding system. Otherwise, we can't do the
+                  ;; right thing; return a multiple-entry list, so Gnus
+                  ;; will do its broken thing.
+
+                  (dolist (curset chars-region)
+                    (unless (memq curset latin-unity-character-sets)
+                      (dolist (curset systems)
+                        (if (memq curset latin-unity-ucs-list)
+                            (throw 'done (list curset))))
+                      (throw 'done (mapcar 'mm-mime-charset
+                                           (delq 'ascii
+                                                 (charsets-in-region
+                                                  b e))))))
+
+                  ;; Okay, Latin Unity does know all about the
+                  ;; character sets in the region. Pass back the first
+                  ;; coding system in the preferred list that can
+                  ;; encode the whole buffer.
+
+                  (dolist (curset systems)
+                    (setq curset
+                          (latin-unity-massage-name curset
+                                                    'buffer-default))
+                    (if (memq curset latin-unity-ucs-list)
+                        (throw 'done (list curset)))
+                    (if (latin-unity-maybe-remap b e curset csets psets t)
+                        (throw 'done (list curset))))
+
+                  ;; Can't encode using anything from the
+                  ;; mm-coding-system-priorities list. Return a
+                  ;; multiple entry list.
+                  (mapcar 'mm-mime-charset
+                          (delq 'ascii (charsets-in-region b e))))))
+      ;; Otherwise, there's nothing really intelligent we can do with
+      ;; the characters.
+      (setq charsets
+            (mm-delete-duplicates
+             (mapcar 'mm-mime-charset
+                     (delq 'ascii (mm-find-charset-region b e)))))))
     (if (and (> (length charsets) 1)
              (memq 'iso-8859-15 charsets)
              (memq 'iso-8859-15 hack-charsets)