Announcements and discussions for Gnus, the GNU Emacs Usenet newsreader
From: Aidan Kehoe <kehoea@parhasard.net>
Subject: XEmacs, Gnus and mm-coding-system-priorities.
Date: Thu, 02 Dec 2004 11:59:01 +0000
Message-ID: <16815.901.778441.575923.z25zdq@parhasard.net>


Hi, 

Further to my message of the 26th, lvmzx4a1vb.fsf@ns5.nestdesign.com , I’ve
made a patch to mm-util.el that takes advantage of Stephen Turnbull’s Latin
Unity to remap messages’ characters and to take notice of the
mm-coding-system-priorities variable under XEmacs. 

With this patch applied, and with latin-unity available,

     (setq mm-coding-system-priorities '(iso-8859-1 iso-8859-15 utf-8))

in your init file tells Gnus to post in Latin 1 if the message fits into
Latin 1--including if, say, Latin 2 U WITH DIAERESIS is used--iso-8859-15 if
the message fits into that but not Latin 1, and UTF-8 if neither of those
things is true. This is much preferable to the current behaviour.
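
The selection rule described above (take the first entry in the priority list that can encode the whole message) can be sketched roughly as follows. This is only an illustration, not the patch itself: the function name `my-first-feasible-coding-system` is hypothetical, and the sketch assumes a GNU Emacs that provides `unencodable-char-position`. It does none of the character remapping that Latin Unity performs under XEmacs.

```elisp
;; Illustrative sketch only; not part of mm-util.el or latin-unity.
;; Assumes GNU Emacs, where `unencodable-char-position' is available.
(defun my-first-feasible-coding-system (string priorities)
  "Return the first coding system in PRIORITIES that can encode STRING.
Return nil if no entry in PRIORITIES can encode every character."
  (catch 'done
    (dolist (cs priorities)
      ;; `unencodable-char-position' returns nil when every character
      ;; between the two indexes of STRING can be encoded by CS.
      (unless (unencodable-char-position 0 (length string) cs nil string)
        (throw 'done cs)))
    ;; Nothing in PRIORITIES fits.
    nil))
```

For example, `(my-first-feasible-coding-system "für" '(iso-8859-1 iso-8859-15 utf-8))` yields `iso-8859-1`, while a string containing the euro sign falls through to `iso-8859-15`.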

Tested under 21.4, 21.5 and the stable GNU Emacs. If you have trouble
applying the patch from a news article, there’s a plain text version
available at
http://parhasard.net/mm-util-xemacs-coding-system-priorities.diff .

What do I have to do to get this included in the standard Gnus?

Best regards, 

        - Aidan
-- 
“As democracy is perfected, the office of president represents, more and
more closely, the inner soul of the people. On some great and glorious day
the plain folks of the land will reach their heart’s desire at last and the
White House will be adorned by a downright moron.” – H.L. Mencken 

--- mm-util.el~	2004-12-02 10:43:08.000000000 +0000
+++ mm-util.el	2004-12-02 11:42:03.000000000 +0000
@@ -587,11 +587,68 @@
 	       charsets))
 	;; Otherwise we're not multibyte, we're XEmacs, or a single
 	;; coding system won't cover it.
-	(setq charsets
-	      (mm-delete-duplicates
-	       (mapcar 'mm-mime-charset
-		       (delq 'ascii
-			     (mm-find-charset-region b e))))))
+
+	;; For intelligent handling of the various ISO-8859-? character sets
+	;; and their common subsets under XEmacs, we use latin-unity.
+	(when (and (not (featurep 'latin-unity))
+		   (locate-library "latin-unity"))
+	  (require 'latin-unity))
+
+	(if (featurep 'latin-unity)
+	    (let ((csets (latin-unity-representations-feasible-region b e))
+		  (psets (latin-unity-representations-present-region b e))
+		  (systems mm-coding-system-priorities)
+		  (chars-region (delq 'ascii (charsets-in-region b e))) curset)
+
+	      (assert (featurep 'xemacs) t 
+		      "Latin Unity shouldn't be available on GNU Emacs.")
+
+	      (setq charsets
+		    (catch 'done
+
+		       ;; Check whether Latin Unity knows about all the
+		       ;; character sets in the region. If it doesn't, and we
+		       ;; have a universal coding system in the
+		       ;; mm-coding-system-priorities list, return that
+		       ;; universal coding system. Otherwise, we can't do the
+		       ;; right thing; return a multiple-entry list, so Gnus
+		       ;; will do its broken thing.
+
+		       (dolist (curset chars-region)
+			 (unless (memq curset latin-unity-character-sets)
+			   (dolist (curset systems)
+			     (if (memq curset latin-unity-ucs-list)
+				 (throw 'done (list curset))))
+			   (throw 'done (mapcar 'mm-mime-charset
+						(delq 'ascii
+						      (charsets-in-region
+						       b e))))))
+
+		       ;; Okay, Latin Unity does know all about the
+		       ;; character sets in the region. Pass back the first
+		       ;; coding system in the preferred list that can
+		       ;; encode the whole buffer.
+
+		       (dolist (curset systems)
+			 (setq curset 
+			       (latin-unity-massage-name curset 
+							 'buffer-default))
+			 (if (memq curset latin-unity-ucs-list)
+			     (throw 'done (list curset)))
+			 (if (latin-unity-maybe-remap b e curset csets psets t)
+			     (throw 'done (list curset))))
+
+		       ;; Can't encode using anything from the
+		       ;; mm-coding-system-priorities list. Return a
+		       ;; multiple entry list.
+		       (mapcar 'mm-mime-charset 
+			       (delq 'ascii (charsets-in-region b e))))))
+	  ;; Otherwise, there's nothing really intelligent we can do with
+	  ;; the characters.
+	  (setq charsets
+		(mm-delete-duplicates 
+		 (mapcar 'mm-mime-charset 
+			 (delq 'ascii (mm-find-charset-region b e)))))))
     (if (and (> (length charsets) 1)
 	     (memq 'iso-8859-15 charsets)
 	     (memq 'iso-8859-15 hack-charsets)

