Gnus development mailing list
* [PATCH] cvs 2005-09-11: html2text.el::html2text-replace-list updates
@ 2005-09-11  8:19 Jari Aalto
  2005-09-11 18:41 ` Romain Francoise
From: Jari Aalto @ 2005-09-11  8:19 UTC



Here is a small patch to update the list. I left out the "real" Euro and
Cent signs so that the text would be as close to pure ASCII as
possible. The patch is against CVS.

2005-09-11  Jari Aalto  <jari dot aalto A T cante dot net>

        * html2text.el: (html2text-replace-list): Added new entities.

Index: lisp/html2text.el
===================================================================
RCS file: /usr/local/cvsroot/gnus/lisp/html2text.el,v
retrieving revision 7.13
diff -u -IId: -b -w -u -r7.13 html2text.el
--- lisp/html2text.el	26 Aug 2005 00:05:02 -0000	7.13
+++ lisp/html2text.el	11 Sep 2005 07:36:20 -0000
@@ -43,8 +43,42 @@
 (defvar html2text-format-single-element-list '(("hr" . html2text-clean-hr)))
 
 (defvar html2text-replace-list
-  '(("&nbsp;" . " ") ("&gt;" . ">") ("&lt;" . "<") ("&quot;" . "\"")
-    ("&amp;" . "&") ("&apos;" . "'"))
+  '(("&acute;" . "`")
+    ("&amp;" . "&")
+    ("&apos;" . "'")
+    ("&brvbar;" . "|")
+    ("&cent;" . "c")
+    ("&circ;" . "^")
+    ("&copy;" . "(C)")
+    ("&curren;" . "(#)")
+    ("&deg;" . "degree")
+    ("&divide;" . "/")
+    ("&euro;" . "e")
+    ("&frac12;" . "1/2")
+    ("&gt;" . ">")
+    ("&iquest;" . "?")
+    ("&laquo;" . "<<")
+    ("&ldquo;" . "\"")
+    ("&lsaquo;" . "(")
+    ("&lsquo;" . "`")
+    ("&lt;" . "<")
+    ("&mdash;" . "--")
+    ("&nbsp;" . " ")
+    ("&ndash;" . "-")
+    ("&permil;" . "%%")
+    ("&plusmn;" . "+-")
+    ("&pound;" . "£")
+    ("&quot;" . "\"")
+    ("&raquo;" . ">>")
+    ("&rdquo;" . "\"")
+    ("&reg;" . "(R)")
+    ("&rsaquo;" . ")")
+    ("&rsquo;" . "'")
+    ("&sect;" . "§")
+    ("&sup1;" . "^1")
+    ("&sup2;" . "^2")
+    ("&sup3;" . "^3")
+    ("&tilde;" . "~"))
   "The map of entity to text.
 
 This is an alist were each element is a dotted pair consisting of an
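
For readers unfamiliar with how such an alist is consumed, here is a minimal
sketch of applying (ENTITY . REPLACEMENT) string pairs to a piece of text.
The names `my-entity-replace-list' and `my-replace-entities' are hypothetical
and are not part of html2text.el; the real code may walk the list
differently. The sketch puts "&amp;" last, on the assumption that decoding it
first could turn "&amp;lt;" into "&lt;" and then into "<".

```elisp
;; Hypothetical stand-in for a few `html2text-replace-list' entries.
;; "&amp;" is deliberately last so that its output is not decoded again.
(defvar my-entity-replace-list
  '(("&lt;" . "<") ("&gt;" . ">") ("&copy;" . "(C)") ("&amp;" . "&"))
  "Illustrative alist of (ENTITY . REPLACEMENT) string pairs.")

(defun my-replace-entities (string)
  "Return STRING with each entity in `my-entity-replace-list' replaced."
  (with-temp-buffer
    (insert string)
    ;; Match entity names case-sensitively.
    (let ((case-fold-search nil))
      (dolist (pair my-entity-replace-list)
        (goto-char (point-min))
        ;; Literal search; the `t t' arguments to `replace-match' keep
        ;; the replacement's case and insert it without interpreting
        ;; backslashes.
        (while (search-forward (car pair) nil t)
          (replace-match (cdr pair) t t))))
    (buffer-string)))
```

Each entry costs one pass over the buffer, so this is only reasonable for
short lists and short messages.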





* Re: [PATCH] cvs 2005-09-11: html2text.el::html2text-replace-list updates
  2005-09-11  8:19 [PATCH] cvs 2005-09-11: html2text.el::html2text-replace-list updates Jari Aalto
@ 2005-09-11 18:41 ` Romain Francoise
  2005-09-19 16:21   ` Reiner Steib
From: Romain Francoise @ 2005-09-11 18:41 UTC
  Cc: ding

Jari Aalto <jari.aalto@cante.net> writes:

> 2005-09-11  Jari Aalto  <jari dot aalto A T cante dot net>
>
>         * html2text.el: (html2text-replace-list): Added new entities.

Applied; thanks!

-- 
Romain Francoise <romain@orebokech.com> | The world is a fine place,
it's a miracle -- http://orebokech.com/ | and worth fighting for.
                                        | --Ernest Hemingway




* Re: [PATCH] cvs 2005-09-11: html2text.el::html2text-replace-list updates
  2005-09-11 18:41 ` Romain Francoise
@ 2005-09-19 16:21   ` Reiner Steib
  2005-09-19 16:44     ` Romain Francoise
From: Reiner Steib @ 2005-09-19 16:21 UTC


On Sun, Sep 11 2005, Romain Francoise wrote:

> Jari Aalto <jari.aalto@cante.net> writes:
>
>> 2005-09-11  Jari Aalto  <jari dot aalto A T cante dot net>
>>
>>         * html2text.el: (html2text-replace-list): Added new entities.
>
> Applied; thanks!

I still doubt that maintaining yet another list of entities is a good
idea...

,----[ <news:v9oei6d4mv.fsf@marauder.physik.uni-ulm.de> ]
| From: Reiner Steib <reinersteib+gmane@imap.cc>
| Subject: Re: html2text
| To: <emacs-devel@gnu.org>
| Cc: [...] (Jari Aalto+mail.emacs)
| Date: Tue, 09 Nov 2004 23:44:24 +0100
| Message-ID: <v9oei6d4mv.fsf@marauder.physik.uni-ulm.de>
| 
| [...]
| >  (defvar html2text-replace-list
| > -  '(("&nbsp;" . " ") ("&gt;" . ">") ("&lt;" . "<") ("&quot;" . "\"")
| > -    ("&amp;" . "&") ("&apos;" . "'"))
| > +  '(("&acute;" . "`")
| 
| This should be "´".
| 
| > +    ("&amp;" . "&")
| > +    ("&apos;" . "'")
| > +    ("&brvbar;" . "|")
| > +    ("&cent;" . "c")
| > +    ("&circ;" . "^")
| > +    ("&copy;" . "(C)")
| > +    ("&curren;" . "¤")
| > +    ("&deg;" . "degree")
| > +    ("&divide;" . "/")
| > +    ("&euro;" . "e")
| > +    ("&frac12;" . "½")
| [...]
| 
| It seems strange to use Latin-1 characters for some entities, but not
| for all encodable by Latin-1.
| 
| On second thought, it looks like there are already more or less
| complete lists[1] e.g. in `mm-url-html-entities' (from Gnus),
| `sgml-char-names', `sgml-char-names-table', `iso-iso2sgml-trans-tab'
| (Emacs) or `w3m-entity-alist' (emacs-w3m).
| 
| Probably one of these could be used.  Hm, maybe the function
| `iso-sgml2iso' could be used in `html2text.el'?
| 
| Bye, Reiner.
| 
| [1] Might be checked with
|     http://www.w3.org/TR/REC-html40/sgml/entities.html or other
|     tables.
`----
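
To make the suggestion above concrete, here is a rough sketch of a decoder
that takes its data from a separate entity table instead of a hand-kept list
of strings. The table shape, (NAME . CODEPOINT) with NAME a symbol, is an
assumption modelled on what a table such as `mm-url-html-entities' might look
like; `my-entity-table' and `my-decode-entities' are hypothetical names, not
existing Gnus or Emacs functions.

```elisp
;; Hypothetical table; a real one would be far more complete.
(defvar my-entity-table
  '((amp . 38) (lt . 60) (gt . 62) (nbsp . 160) (copy . 169) (euro . 8364))
  "Illustrative alist of (NAME . CODEPOINT) pairs, NAME being a symbol.")

(defun my-decode-entities (string &optional table)
  "Return STRING with named entities decoded according to TABLE.
TABLE defaults to `my-entity-table'.  Unknown entities are left as is."
  (let ((table (or table my-entity-table))
        (case-fold-search nil))
    (replace-regexp-in-string
     "&\\([A-Za-z][A-Za-z0-9]*\\);"
     (lambda (match)
       ;; MATCH is the whole entity, e.g. "&copy;"; strip "&" and ";".
       (let* ((sym (intern-soft (substring match 1 -1)))
              (code (and sym (cdr (assq sym table)))))
         (if code (string code) match)))
     string t t)))
```

Because the string is scanned once, the result of decoding "&amp;" is not
decoded a second time, and any table with the same shape can be passed in
without touching the function.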

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/





* Re: [PATCH] cvs 2005-09-11: html2text.el::html2text-replace-list updates
  2005-09-19 16:21   ` Reiner Steib
@ 2005-09-19 16:44     ` Romain Francoise
From: Romain Francoise @ 2005-09-19 16:44 UTC


Reiner Steib <reinersteib+gmane@imap.cc> writes:

> I still doubt that maintaining yet another list of entities is a good
> idea...

Sure, but in the meantime updating the list doesn't cost anything.

-- 
Romain Francoise <romain@orebokech.com> | I like the streets when
it's a miracle -- http://orebokech.com/ | they're empty, I can make the
                                        | rest up.



