From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.user/4278
Path: news.gmane.org!not-for-mail
From: Aidan Kehoe <kehoea@parhasard.net>
Newsgroups: gmane.emacs.gnus.user
Subject: XEmacs, Gnus and mm-coding-system-priorities.
Date: Thu, 02 Dec 2004 11:59:01 +0000
Message-ID: <16815.901.778441.575923.z25zdq@parhasard.net>
NNTP-Posting-Host: main.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Trace: sea.gmane.org 1138670243 22602 80.91.229.2 (31 Jan 2006 01:17:23 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Tue, 31 Jan 2006 01:17:23 +0000 (UTC)
Original-X-From: nobody Tue Jan 17 17:33:31 2006
Original-Path: quimby.gnus.org!newsfeed1.e.nsc.no!uio.no!feed.news.tiscali.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
Original-Newsgroups: gnu.emacs.gnus,comp.emacs.xemacs
Original-X-Trace: individual.net jhpJO/FF8BZRMigu7NWGtQ+sBxloPjm28bDGU6uDpspE9k07y/
User-Agent: Gnus/5.1002 (Gnus v5.10.2) XEmacs/21.4 (Rational FORTRAN, linux)
Cancel-Lock: sha1:zlOLx050Mc+xxxPxnn0+jPB8Xvk=
Original-Xref: bridgekeeper.physik.uni-ulm.de gnus-emacs-gnus:4419
Original-Lines: 109
X-Gnus-Article-Number: 4419   Tue Jan 17 17:33:31 2006
Xref: news.gmane.org gmane.emacs.gnus.user:4278
Archived-At: <http://permalink.gmane.org/gmane.emacs.gnus.user/4278>


Hi, 

Further to my message of the 26th, <lvmzx4a1vb.fsf@ns5.nestdesign.com>, I’ve
made a patch to mm-util.el that uses Stephen Turnbull’s Latin Unity to remap
messages’ characters and to make Gnus honour the
mm-coding-system-priorities variable under XEmacs.

With this patch applied, and with latin-unity available,

     (setq mm-coding-system-priorities '(iso-8859-1 iso-8859-15 utf-8))

in your init file tells Gnus to post in Latin 1 if the message fits into
Latin 1 (even if, say, the Latin 2 U WITH DIAERESIS is used, since Latin
Unity can remap it), in iso-8859-15 if the message fits into that but not
into Latin 1, and in UTF-8 if neither of those things is true. This is much
preferable to the current behaviour.
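
The fallback order described above can be sketched outside Emacs. What
follows is only an illustration in Python, not part of the patch: the name
pick_charset is hypothetical, and Python's codecs simply test whether each
character exists in a charset, so the sketch does not model Latin Unity's
remapping of Mule characters between the ISO-8859 sets.

```python
def pick_charset(text, priorities=("iso-8859-1", "iso-8859-15", "utf-8")):
    """Return the first charset in PRIORITIES that can encode all of TEXT.

    Hypothetical helper for illustration only; returns None when no
    charset in the list can represent the whole text.
    """
    for charset in priorities:
        try:
            # Encoding raises UnicodeEncodeError if any character
            # has no representation in this charset.
            text.encode(charset)
        except UnicodeEncodeError:
            continue
        return charset
    return None


if __name__ == "__main__":
    for sample in ("Grüße", "20 €", "Erdős"):
        print(sample, "->", pick_charset(sample))
```

With the priority list from the setq form, text that fits into Latin 1 is
sent as iso-8859-1, text that additionally needs the euro sign as
iso-8859-15, and everything else as UTF-8.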

Tested under XEmacs 21.4 and 21.5, and under the stable GNU Emacs. If you have trouble
applying the patch from a news article, there’s a plain text version
available at
http://parhasard.net/mm-util-xemacs-coding-system-priorities.diff .

What do I have to do to get this included in the standard Gnus?

Best regards, 

        - Aidan
-- 
“As democracy is perfected, the office of president represents, more and
more closely, the inner soul of the people. On some great and glorious day
the plain folks of the land will reach their heart’s desire at last and the
White House will be adorned by a downright moron.” – H.L. Mencken 

--- mm-util.el~	2004-12-02 10:43:08.000000000 +0000
+++ mm-util.el	2004-12-02 11:42:03.000000000 +0000
@@ -587,11 +587,68 @@
 	       charsets))
 	;; Otherwise we're not multibyte, we're XEmacs, or a single
 	;; coding system won't cover it.
-	(setq charsets
-	      (mm-delete-duplicates
-	       (mapcar 'mm-mime-charset
-		       (delq 'ascii
-			     (mm-find-charset-region b e))))))
+
+	;; For intelligent handling of the various ISO-8859-? character sets
+	;; and their common subsets under XEmacs, we use latin-unity.
+	(when (and (not (featurep 'latin-unity))
+		   (locate-library "latin-unity"))
+	  (require 'latin-unity))
+
+	(if (featurep 'latin-unity)
+	    (let ((csets (latin-unity-representations-feasible-region b e))
+		  (psets (latin-unity-representations-present-region b e))
+		  (systems mm-coding-system-priorities)
+		  (chars-region (delq 'ascii (charsets-in-region b e))) curset)
+
+	      (assert (featurep 'xemacs) t 
+		      "Latin Unity shouldn't be available on GNU Emacs.")
+
+	      (setq charsets
+		    (catch 'done
+
+		       ;; Check whether Latin Unity knows about all the
+		       ;; character sets in the region. If it doesn't, and we
+		       ;; have a universal coding system in the
+		       ;; mm-coding-system-priorities list, return that
+		       ;; universal coding system. Otherwise, we can't do the
+		       ;; right thing; return a multiple-entry list, so Gnus
+		       ;; will do its broken thing.
+
+		       (dolist (curset chars-region)
+			 (unless (memq curset latin-unity-character-sets)
+			   (dolist (curset systems)
+			     (if (memq curset latin-unity-ucs-list)
+				 (throw 'done (list curset))))
+			   (throw 'done (mapcar 'mm-mime-charset
+						(delq 'ascii
+						      (charsets-in-region
+						       b e))))))
+
+		       ;; Okay, Latin Unity does know all about the
+		       ;; character sets in the region. Pass back the first
+		       ;; coding system in the preferred list that can
+		       ;; encode the whole buffer.
+
+		       (dolist (curset systems)
+			 (setq curset 
+			       (latin-unity-massage-name curset 
+							 'buffer-default))
+			 (if (memq curset latin-unity-ucs-list)
+			     (throw 'done (list curset)))
+			 (if (latin-unity-maybe-remap b e curset csets psets t)
+			     (throw 'done (list curset))))
+
+		       ;; Can't encode using anything from the
+		       ;; mm-coding-system-priorities list. Return a
+		       ;; multiple entry list.
+		       (mapcar 'mm-mime-charset 
+			       (delq 'ascii (charsets-in-region b e))))))
+	  ;; Otherwise, there's nothing really intelligent we can do with
+	  ;; the characters.
+	  (setq charsets
+		(mm-delete-duplicates 
+		 (mapcar 'mm-mime-charset 
+			 (delq 'ascii (mm-find-charset-region b e)))))))
     (if (and (> (length charsets) 1)
 	     (memq 'iso-8859-15 charsets)
 	     (memq 'iso-8859-15 hack-charsets)